DIFFUSION AND SIMULTANEOUS CHEMICAL REACTIONS:
II. THE EQUATIONS OF THOSE SYSTEMS IN WHICH TRANSPORT
OCCURS FROM ONE REGION TO AN ADJOINING REGION

D. G. O’SULLIVAN 


COURTAULD INSTITUTE OF BIOCHEMISTRY
THE MIDDLESEX HOSPITAL, LONDON, W. 1, ENGLAND


A method is described for obtaining the solutions to the equations of systems in which a 
reacting substance diffuses between two adjacent regions. This substance may be produced or 
removed in the two regions at rates expressible as polynomial functions of time and may be 
removed throughout both regions by a first-order reaction. The solutions are obtained from 
simpler results, many of which are available and more of which are listed in this paper. Possible 
application in the study of the validity of cytochemical staining procedures is discussed. 


I. Description of the systems. Consider two adjacent space regions of
any shape where initially regions 1 and 2 contain uniform concentrations,
$C_1$ and $C_2$ respectively, of a substance which then undergoes diffusion
from one region to the other. The diffusing substance may partition between
the regions so that, provided no other resistance to the flux exists,
$c_1' = \alpha c_2'$ for $t > 0$ at the interface, where $c_1'(x, y, z, t)$ and
$c_2'(x, y, z, t)$ represent the concentrations in regions 1 and 2
respectively. If an interfacial resistance, e.g., a membrane of permeability
$P$, is present then equations (1) below form the interfacial boundary
conditions. Other boundaries will exist for both regions but they must be
such that no net diffusion occurs across them. Thus for each region there
will be an additional boundary at which either $\partial c'/\partial n = 0$,
$n$ being the normal to the boundary surface, or $c'$ will be finite for any
finite time. The former condition will apply to physical boundaries such as
the walls of a containing vessel and the latter is more convenient for
mathematical boundaries such as the central point of a spherical region or
at infinity.

The functions $c_1'$ and $c_2'$ for any constant initial concentrations,
$C_1$ and $C_2$ in regions 1 and 2 respectively, may readily be formulated
from slightly simpler results that are more likely to be available. For
example, if solutions are available for the appropriate problem with initial
concentrations $C$ and zero in regions 1 and 2 respectively, then $c_1'$ is
obtained by taking the region 1 solution, replacing $C$ by
$C_1 - \alpha C_2$ and adding the resulting function to $\alpha C_2$. The
required $c_2'$ is obtained from the region 2 solution by replacing $C$ by
$C_1 - \alpha C_2$ and adding the resulting function to $C_2$. Appropriate
solutions are available for regions of many shapes, particularly in those
special cases when $\alpha = 1$ (e.g., Byerly, 1893; Jacobs, 1935; Carslaw
and Jaeger, 1947; O’Sullivan, 1954).
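
The replacement rule just described can be sketched in a few lines of Python.
The functions `unit_region1` and `unit_region2` are hypothetical names for an
available solution with initial concentrations C and zero; purely for
illustration they use the classical error-function solution for two
semi-infinite regions with equal diffusion coefficients and a partition
coefficient of unity, which is an assumption and not one of the cases listed
in this paper.

```python
import math

def unit_region1(C, x, t, D=1.0):
    """Region 1 (x < 0) solution for initial concentrations C and 0.

    Illustrative assumption: two semi-infinite regions, the same diffusion
    coefficient D on both sides, partition coefficient alpha = 1.
    """
    return 0.5 * C * math.erfc(x / (2.0 * math.sqrt(D * t)))

def unit_region2(C, x, t, D=1.0):
    """Region 2 (x >= 0) counterpart; the same expression holds when alpha = 1."""
    return 0.5 * C * math.erfc(x / (2.0 * math.sqrt(D * t)))

def general_solution(C1, C2, alpha, x, t):
    """Replace C by C1 - alpha*C2, then add alpha*C2 (region 1) or C2 (region 2).

    The demo unit solutions above are valid only for alpha = 1; for any other
    alpha a matching pair of unit solutions would have to be supplied.
    """
    C = C1 - alpha * C2
    if x < 0:
        return alpha * C2 + unit_region1(C, x, t)
    return C2 + unit_region2(C, x, t)

far_left = general_solution(3.0, 1.0, 1.0, x=-50.0, t=1.0)   # deep in region 1
far_right = general_solution(3.0, 1.0, 1.0, x=50.0, t=1.0)   # deep in region 2
interface = general_solution(3.0, 1.0, 1.0, x=0.0, t=1.0)    # at the interface
```

Far from the interface the two values stay at the initial concentrations of
their regions, and at the interface the value lies midway between them, as
expected for this symmetric illustrative case.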
Thus we define $c_1'(x, t)$ and $c_2'(x, t)$ to be respectively solutions to
the equations

$$\frac{\partial c_\nu'}{\partial t} = D_\nu \nabla^2 c_\nu',$$

$\nu = 1, 2$ referring to regions 1 and 2 respectively; with initial conditions

$$c_\nu'(x, 0) = C_\nu,$$

where $C_1$ and $C_2$ are constant parameters; interfacial boundary conditions,
either

$$c_1' = \alpha c_2' \quad\text{with}\quad D_1 \frac{\partial c_1'}{\partial n} = D_2 \frac{\partial c_2'}{\partial n}$$

or

$$D_1 \frac{\partial c_1'}{\partial n} = D_2 \frac{\partial c_2'}{\partial n} = P(\alpha c_2' - c_1'), \tag{1}$$

$n$ being the normal to the interface, directed into region 2, and $P$ being the
permeability of the interface; and “outer” boundary conditions, either

$$\frac{\partial c_\nu'}{\partial n} = 0$$

or $c_\nu'$ being finite for finite time at all points on the “outer” boundaries,
$n$ being the normal to the appropriate surface.

In addition, we now introduce three new symbols $\psi_\nu$, $\psi_{\nu 1}$, and
$\psi_{\nu 2}$; $\nu = 1, 2$ referring to the two regions, these being functions
$c_\nu'$ to which we have assigned special values for the parameters $C_1$ and
$C_2$. Thus $\psi_\nu$ is the function $c_\nu'$ for the particular initial
concentrations with which one is concerned in a definite problem;
$\psi_{\nu 1}$ is $c_\nu'$ with $C_1$ given the value unity and $C_2$ the value
zero; and $\psi_{\nu 2}$ consists of the function $c_\nu'$ when $C_1$ is zero
and $C_2$ is unity. Thus the three sets of functions $\psi_\nu$,
$\psi_{\nu 1}$, and $\psi_{\nu 2}$ obey the differential equations and boundary
conditions for $c_\nu'$ and only differ in that the initial concentrations
$C_1$ and $C_2$ possess specific values.
II. Transformations. If the solutions $c_\nu'$ for any diffusion system that is
included in those described in Section I are available, the functions
$\psi_\nu$, $\psi_{\nu 1}$, and $\psi_{\nu 2}$ can be obtained immediately. Then
take the same system with the initial concentrations used in formulating
$\psi_\nu$ and with the following chemical reactions occurring in addition to
the diffusion process. The diffusing substance is considered to be produced or
removed, per unit volume, in regions 1 and 2 at rates which are expressible as
polynomial functions of time and to be removed throughout the system by a
first-order reaction with velocity constant $k$. If $c_\nu(x, t)$ represents
the concentrations of diffusing substance in regions 1 and 2, then the system
will obey the equations

$$\frac{\partial c_1}{\partial t} = D_1 \nabla^2 c_1 - k c_1 + \sum_{n=0}^{N_1} A_n t^n, \tag{2}$$

$$\frac{\partial c_2}{\partial t} = D_2 \nabla^2 c_2 - k c_2 + \sum_{n=0}^{N_2} B_n t^n, \tag{3}$$

with the same initial and boundary conditions as for the functions $\psi_\nu$.
The solutions $c_\nu$ may readily be obtained from $c_\nu'$ by use of the
transformations

$$c_\nu = \psi_\nu \exp(-kt) + \sum_{n=0}^{N_1} A_n n! \int_0^t \cdots \int_0^t \psi_{\nu 1} \exp(-kt)\,(dt)^{n+1} + \sum_{n=0}^{N_2} B_n n! \int_0^t \cdots \int_0^t \psi_{\nu 2} \exp(-kt)\,(dt)^{n+1}. \tag{4}$$

There is no restriction on the actual values of the initial concentrations.
If they are both zero, then $\psi_\nu$ will be everywhere zero for all time and
the terms $\psi_\nu \exp(-kt)$ will disappear from the transformations. The
velocity constant $k$ may be positive, zero, or negative but must, of course,
have a constant value throughout both regions. The constant coefficients $A_n$
and $B_n$ may be any real numbers, including zero.

Application of the operations $D_\nu \nabla^2$ and $(k + \partial/\partial t)$
to the functions $c_\nu$, utilizing the facts that $\psi_\nu$, $\psi_{\nu 1}$,
and $\psi_{\nu 2}$ all obey the differential equations for functions $c_\nu'$,
results in proof that functions $c_\nu$ obey equations (2) and (3). It can also
be directly proved that $c_\nu$, defined by these transformations, obey the
initial and boundary conditions considered in this paper (for method, see
O’Sullivan, 1954, 1955).
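
As a minimal numerical sketch of transformation (4), consider the degenerate
case in which the partition coefficient is unity, both regions start at the
same concentration C0, and the same single term A t^n is produced in both
regions. The concentration then stays uniform, so that the untransformed
solution is simply C0 and the two unit solutions sum to unity, and (4) reduces
to an ordinary differential equation that can be checked independently. The
function names and parameter values below are assumptions made for this
illustration only.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def transformed(C0, A, n, k, t):
    """c(t) from transformation (4) in the spatially uniform special case.

    n! times the (n+1)-fold repeated integral of g from 0 to t equals the
    single integral of (t - s)^n g(s) ds (Cauchy's repeated-integral formula).
    """
    repeated = simpson(lambda s: (t - s) ** n * math.exp(-k * s), 0.0, t)
    return C0 * math.exp(-k * t) + A * repeated

def direct(C0, A, n, k, t, steps=20000):
    """Integrate dc/dt = -k c + A t^n with classical RK4 as an independent check."""
    h = t / steps
    rhs = lambda tau, c: -k * c + A * tau ** n
    c, tau = C0, 0.0
    for _ in range(steps):
        k1 = rhs(tau, c)
        k2 = rhs(tau + h / 2, c + h * k1 / 2)
        k3 = rhs(tau + h / 2, c + h * k2 / 2)
        k4 = rhs(tau + h, c + h * k3)
        c += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        tau += h
    return c

c_transform = transformed(C0=2.0, A=0.5, n=2, k=0.7, t=3.0)
c_direct = direct(C0=2.0, A=0.5, n=2, k=0.7, t=3.0)
```

The two values should agree to within the quadrature and integration error,
which illustrates why the factor n! accompanies each repeated integral.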
III. Rate of uptake of diffusing substance by region 2. If no reactions
are involved, the rate at which a diffusing substance will enter region 2
is defined by

$$R' = -\iint D_1 \frac{\partial c_1'}{\partial n}\,dS = -\iint D_2 \frac{\partial c_2'}{\partial n}\,dS,$$

$n$ being directed into region 2. When reactions are occurring, the rate,
defined by similar expressions involving $c_1$ and $c_2$, becomes from
transformation (4)

$$R = \Psi \exp(-kt) + \sum_{n=0}^{N_1} A_n n! \int_0^t \cdots \int_0^t \Psi_1 \exp(-kt)\,(dt)^{n+1} + \sum_{n=0}^{N_2} B_n n! \int_0^t \cdots \int_0^t \Psi_2 \exp(-kt)\,(dt)^{n+1}, \tag{5}$$

where $\Psi$ is the function $R'$ with specified initial concentrations,
$\Psi_1$ is $R'$ for initial concentrations unity in region 1 and zero in
region 2, and $\Psi_2$ consists of $R'$ in which values of zero and unity have
been substituted for the initial concentrations in regions 1 and 2
respectively.

IV. Steady-state conditions. In many systems in which diffusion with reaction
takes place, the concentration at any point in the system tends to a constant
value with increasing time. Steady-state solutions to the diffusion equations
are naturally simpler than those for the transient state and may frequently be
obtained from the original differential equations with

$$\frac{\partial c_\nu}{\partial t} = 0.$$

Also, if the transient state solutions are known, simply letting $t$ tend to
infinity will provide steady-state solutions, if the latter exist.
Now a steady state will arise if the equations

$$\frac{\partial c_1}{\partial t} = D_1 \nabla^2 c_1 - k c_1 + A_0$$

and

$$\frac{\partial c_2}{\partial t} = D_2 \nabla^2 c_2 - k c_2 + B_0$$

apply to regions 1 and 2 respectively, where $k$ is positive and either $A_0$
or $B_0$ may be zero. The transformations giving $c_1$ and $c_2$ in these
circumstances are, from equations (4),

$$c_\nu = \psi_\nu \exp(-kt) + A_0 \int_0^t \psi_{\nu 1} \exp(-kt)\,dt + B_0 \int_0^t \psi_{\nu 2} \exp(-kt)\,dt.$$

If $t$ tends to infinity, the integrals, assuming they exist, become the
Laplace transforms of the appropriate concentrations with the constant $k$
replacing $p$ (Carslaw and Jaeger, 1947; Danckwerts, 1951; O’Sullivan,
1955). Thus for the steady-state concentrations

$$c_\nu = A_0 \bar{\psi}_{\nu 1}(k) + B_0 \bar{\psi}_{\nu 2}(k). \tag{6}$$

If the Laplace transforms $\bar{c}_\nu'(p)$ happen to be available, the required
steady-state solutions may be written down immediately in the most convenient
form.

V. Extension to three regions. As biological membranes are occasionally
considered as lipoid layers of definite thickness, it is worth while to
consider briefly the case when region 1 adjoins region 2, which in turn has a
common interface with yet another medium in region 3. Let the interfacial
and “outer” boundary conditions be any of the types described earlier
for two regions. If we know the solutions to the three equations

$$\frac{\partial c_\nu'}{\partial t} = D_\nu \nabla^2 c_\nu', \quad\text{where } \nu = 1, 2, 3,$$

with $c_\nu' = C_\nu$ initially, constants for the three regions, then the
solutions to

$$\frac{\partial c_\nu}{\partial t} = D_\nu \nabla^2 c_\nu - k c_\nu + m_\nu$$

for the same initial and boundary conditions are given by the transformations

$$c_\nu = \psi_\nu \exp(-kt) + m_1 \int_0^t \psi_{\nu 1} \exp(-kt)\,dt + m_2 \int_0^t \psi_{\nu 2} \exp(-kt)\,dt + m_3 \int_0^t \psi_{\nu 3} \exp(-kt)\,dt,$$

where $\psi_\nu$ is the appropriate solution $c_\nu'$ in which the $C_\nu$'s are
given the desired specific values; $\psi_{\nu\mu}$ consists of $c_\nu'$ with
the initial concentrations unity in region $\mu$ and zero in the other two
regions.

VI. Some examples of functions $c_\nu'$. As many examples of solutions $c_\nu'$
are not available when $\alpha$ is not necessarily unity and when $D_1$ and
$D_2$ are not necessarily identical, a few additional results are listed here.

(a) Plane slab in an infinite medium. Region 1 is defined by $0 < x < l$
and region 2 by $x > l$. An interfacial resistance is present at $x = l$. Here

$$c_1' = \alpha C_2 + \frac{2}{\pi}(C_1 - \alpha C_2)\,\alpha \nu_1 P^2 \int_0^\infty \frac{1}{f_1} \exp(-D_1 t u^2) \sin lu \cos xu \,du,$$

$$c_2' = C_2 + \frac{2}{\pi}(C_1 - \alpha C_2)\,\nu_1 P \int_0^\infty \frac{f_2}{f_1} \exp(-D_1 t u^2) \sin lu \,du,$$

where

$$\nu_1 = \left(\frac{D_1}{D_2}\right)^{1/2},$$

$$f_1 = u\{(P \cos lu - D_1 u \sin lu)^2 + P^2 \alpha^2 \nu_1^2 \sin^2 lu\}$$

and

$$f_2 = P \cos lu \cos\{(x - l)\nu_1 u\} - D_1 u \sin lu \cos\{(x - l)\nu_1 u\} - P \alpha \nu_1 \sin lu \sin\{(x - l)\nu_1 u\}.$$

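(b) Plane slab in finite medium. Region 1 is defined by $0 < x < l_1$, and
region 2 by $l_1 < x < l_2$. No interfacial resistance is present. Here

$$c_1' = \alpha L - 2(C_1 - \alpha C_2) \sum_{s=1}^{\infty} f_s \exp(-D_1 \xi_s^2 t) \cos x\xi_s \sin\{(l_2 - l_1)\nu_1 \xi_s\},$$

$$c_2' = L + 2\nu_1 (C_1 - \alpha C_2) \sum_{s=1}^{\infty} f_s \exp(-D_1 \xi_s^2 t) \sin l_1 \xi_s \cos\{(l_2 - x)\nu_1 \xi_s\},$$

where $\nu_1 = (D_1/D_2)^{1/2}$,

$$L = \frac{C_1 l_1 + C_2 l_2 - C_2 l_1}{l_2 - (1 - \alpha) l_1},$$

$$\frac{1}{f_s} = \xi_s \left[ \nu_1 \{l_2 - (1 - \alpha) l_1\} \cos\{(l_2 - l_1)\nu_1 \xi_s\} \cos l_1 \xi_s - \{(l_2 - l_1)\alpha \nu_1^2 + l_1\} \sin\{(l_2 - l_1)\nu_1 \xi_s\} \sin l_1 \xi_s \right],$$

and $\pm\xi_s$ ($s = 1, 2, 3, \ldots$) are the roots, all real and symmetrical
with respect to the origin, of the equation

$$\sin\{(l_2 - l_1)\nu_1 \xi\} \cos l_1 \xi + \alpha \nu_1 \cos\{(l_2 - l_1)\nu_1 \xi\} \sin l_1 \xi = 0$$

(see Carslaw and Jaeger, 1947b).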
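(c) Case (b) with a surface resistance at $x = l_1$:

$$c_1' = \alpha L - 2P(C_1 - \alpha C_2) \sum_{s=1}^{\infty} f_s \exp(-D_1 \theta_s^2 t) \sin\{(l_2 - l_1)\nu_1 \theta_s\} \cos x\theta_s,$$

$$c_2' = L + 2P\nu_1 (C_1 - \alpha C_2) \sum_{s=1}^{\infty} f_s \exp(-D_1 \theta_s^2 t) \cos\{(l_2 - x)\nu_1 \theta_s\} \sin l_1 \theta_s,$$

where $\nu_1 = (D_1/D_2)^{1/2}$,

$$\frac{1}{f_s} = \theta_s \left[ P\nu_1 \{l_2 - (1 - \alpha) l_1\} \cos\{(l_2 - l_1)\nu_1 \theta_s\} \cos l_1 \theta_s - \{D_1 + P l_1 + P\alpha \nu_1^2 (l_2 - l_1)\} \sin\{(l_2 - l_1)\nu_1 \theta_s\} \sin l_1 \theta_s - D_1 (l_2 - l_1)\nu_1 \theta_s \cos\{(l_2 - l_1)\nu_1 \theta_s\} \sin l_1 \theta_s - D_1 l_1 \theta_s \sin\{(l_2 - l_1)\nu_1 \theta_s\} \cos l_1 \theta_s \right]$$

and $\pm\theta_s$ ($s = 1, 2, 3, \ldots$) are the roots of

$$(D_1 \theta \tan l_1 \theta - P) \tan\{(l_2 - l_1)\nu_1 \theta\} = P\alpha \nu_1 \tan l_1 \theta.$$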

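(d) Sphere with no surface resistance set in infinite medium. Region 1 is
defined by $0 < r < a$ and region 2 by $r > a$. Here

$$c_1' = \alpha C_2 + \frac{2a^2 \alpha \nu_1^3 (C_1 - \alpha C_2)}{\pi r} \int_0^\infty u f_4 \exp(-D_1 t u^2) \sin ru \,du, \tag{7}$$

$$c_2' = C_2 + \frac{2a \nu_1^2 (C_1 - \alpha C_2)}{\pi r} \int_0^\infty f_4 f_5 \exp(-D_1 t u^2) \,du, \tag{8}$$

where $\nu_1 = (D_1/D_2)^{1/2}$,

$$f_4 = \frac{\sin au - au \cos au}{u\left[\nu_1^2 a^2 u^2 \sin^2 au + \{\alpha \nu_1^2 au \cos au + (1 - \alpha \nu_1^2) \sin au\}^2\right]}$$

and

$$f_5 = \nu_1 au \cos\{(r - a)\nu_1 u\} \sin au + \sin\{(r - a)\nu_1 u\} \{\alpha \nu_1^2 au \cos au + (1 - \alpha \nu_1^2) \sin au\}.$$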


VII. Example of application of the transformations. Consider a sphere
($0 < r < a$), with no surface resistance, set in an infinite medium. A
substance, initially absent throughout the system, is produced uniformly
in the sphere at rate $m$ (e.g., mol/cc/sec). This substance diffuses out
of the spherical region and, throughout the system, is simultaneously
removed by a first-order reaction of rate $kc$. The concentration patterns
may be obtained by applying transformations

$$c_1 = m \int_0^t \psi_{11} \exp(-kt)\,dt \quad\text{for } 0 < r < a,$$

$$c_2 = m \int_0^t \psi_{21} \exp(-kt)\,dt \quad\text{for } r > a,$$

obtained from equations (4), to the results quoted in Section VI(d). The
function $\psi_{11}$ is obtained by equating $C_1$ to unity and $C_2$ to zero
in equation (7) and $\psi_{21}$ is similarly obtained from equation (8).

Thus the required solutions are

$$c_1 = \frac{2ma^2 \alpha \nu_1^3}{\pi r} \int_0^\infty u f_4 f_6 \sin ru \,du \tag{9}$$

and

$$c_2 = \frac{2ma \nu_1^2}{\pi r} \int_0^\infty f_4 f_5 f_6 \,du, \tag{10}$$

where

$$f_6 = \frac{1 - \exp\{-(D_1 u^2 + k)t\}}{D_1 u^2 + k}.$$

Application of the transformation

$$R = m \int_0^t \Psi_1 \exp(-kt)\,dt$$

obtained from equation (5) gives the rate at which the diffusing substance
passes out of the spherical region. Now

$$\Psi_1 = -4\pi a^2 D_1 \left(\frac{\partial \psi_{11}}{\partial r}\right)_{r=a} = 8a^2 D_1 \alpha \nu_1^3 \int_0^\infty u f_4 \exp(-D_1 t u^2)(\sin au - au \cos au)\,du$$

and, consequently,

$$R = 8a^2 D_1 \alpha \nu_1^3 m \int_0^\infty u f_4 f_6 (\sin au - au \cos au)\,du.$$

In this problem a steady state is reached and application of the result
$c_\nu = m\bar{\psi}_{\nu 1}(k)$, obtained from equation (6), gives the
steady-state concentrations (cf. Rashevsky, 1948).

VIII. Application to cytochemistry. Although the living cell is in a dynamic
state, the organization is such that permanent structural features
exist and evidence shows that, to a first approximation, certain functional 
aspects can be associated with definite regions in the cell. As many proc- 
esses involve enzyme-catalyzed reactions, the determination of intra- 
cellular enzymic topography is important and since 1935 much study has 
been devoted to the precise mapping of the distribution of high molecular 
weight substances in tissues and in individual cells. Investigations have 
proceeded along several independent lines, one of the most important po- 
tentially being the staining techniques of histo- and cytochemistry. Here 
colored regions are produced in tissue slices which purport to map the enzymic
distribution. In skilled hands the best of these methods produce results
which on microscopic examination show very sharply defined regions
and no evidence of crystallinity in the colored deposit. In certain examples 
(e.g., Gomori, 1939, 1941, 1948a, b) a preliminary deposit, obtained follow- 
ing the enzymic action, is subjected to further treatment which results, 
eventually, in its replacement by a readily visible deposit. The Gomori 
methods have been subjected to much critical appraisal (e.g., Danielli, 
1946) and a mathematical study by Johansen and Linderstrøm-Lang
(1951, 1952, 1953; Carlsen, Jensen, and Johansen, 1953) has been ren- 
dered inconclusive because of the use of unsuitable data on the supersatu- 
ration of calcium phosphate solutions. The fact that the staining is only 
produced as a result of several processes means that several stages must 
prove on scrutiny to be above suspicion. The Gomori methods have, how- 
ever, been the most widely and successfully applied cytochemical tech- 
niques. In other methods, color production occurs directly as a result of, 
and simultaneously with, substrate removal. Examples are azo-dye meth- 
ods (e.g., Nachlas and Seligman, 1949), methods based on thiocholine 
(Koelle, 1950), myristoylcholine (Gomori, 1948a), and on substituted in- 
doxyl acetates (Holt and Withers, 1952; Holt, 1952, 1954; Barrnett and 
Seligman, 1951). For these, fresh tissue is treated either by a fixation pro- 
cedure or by freeze drying and then slices of frozen tissue are incubated 
in a solution containing a chromogenic substrate together with a develop- 
ing agent, the latter being present in large excess. In one of the variations 
of Holt’s method, for example, the tissue is immersed in a solution containing
the substrate, 5-bromoindoxyl acetate, together with the developing
agent potassium ferricyanide with potassium ferrocyanide. Certain
esterases in the tissue effect hydrolysis of the ester and the 5-bromoindoxyl 
produced gets rapidly oxidized in the presence of the ferricyanide giving 
5:5’-dibromoindigo. 

For a cytochemical method of this type to be successful it is highly de- 
sirable that the following conditions should be satisfied: 

(a) preparatory processing and subsequent treatment of the tissue 
should not affect the enzymic action or distribution, 

(b) rapid penetration of substrate and developing agent should occur 
to all cellular levels, 

(c) substrate should interact specifically with one enzyme or group of 
enzymes or special type of enzymic activity, 

(d) the developing agent should not affect the enzymic processes or the 
penetration of the substrate, 
(e) the product of the enzyme-catalyzed reaction should react very rapidly
with the developing agent and the velocity constant should not be
affected by different cellular environment, 

(f) the colored product should be immediately deposited, i.e., should
have very low solubility, should not supersaturate, should deposit in an 
amorphous or sub-microcrystalline state, and should be stable, and 

(g) no preferential adsorption of any of the substances involved should 
occur in special localities which are unrelated to the enzymic distribution. 

Now a method might be satisfactory for qualitative or even quantitative
application even if not all of the above conditions are obeyed, but it is clear
that the processes that occur in any particular case must be clearly under- 
stood in order to assess the validity of the method. Completely false lo- 
calizations may be obtained for a variety of reasons, e.g., preferential 
adsorption of the substances involved in special regions. General adsorp- 
tion within the tissue may occur and will be pronounced if a mechanism 
like hydrogen bonding to proteins is involved. This does not necessarily 
invalidate a process but introduces complicating features; thus penetra- 
tion of substrate or developing agent may be retarded but, on the other 
hand, diffusion of intermediate and final products may be very much re- 
stricted. 

If investigation shows that adequate localization occurs, then its precision
might be tested by the use of the following model. Consider the enzyme
to be uniformly distributed in a spherical site of radius $a$ in an infinite
medium. Henceforth in this paper the term “site” must be considered
as being defined in this way. The value of $a$ has to be selected,
e.g., an arbitrary value of 1 μ could always be taken or a suitable value
or set of values for a particular tissue could be assessed experimentally. 
With the rapidity at which penetration should occur it is likely that the 
product of the enzymic action will be produced uniformly throughout the 
site by a zero-order reaction. It is removed from the site by diffusion and 
by reaction with the developing agent. Assuming that under the conditions
involved the removal reaction is first order, then equations (9) and
(10) will give the concentration mapping of the diffusing substance. If one
takes the diffusion coefficient in the tissue as $D$ and assumes that the
media inside and outside the site are identical, then these equations may
be simplified as $D_1 = D_2 = D$, $\nu_1 = (D_1/D_2)^{1/2} = 1$, and
$\alpha = 1$. The results for $c_1$ and $c_2$ take the same form, so that
throughout the system

$$c = \frac{2m}{\pi r} \int_0^\infty \frac{\phi \sin ru}{\theta}\,[1 - \exp(-\theta t)]\,du, \tag{11}$$

where $\theta = Du^2 + k$ and $\phi = (\sin au - au \cos au)/u^2$. This result
can, of course, be arrived at by other methods and if desired can be expressed
in terms of sums and products of exponential and error functions. Assuming
that the product is immediately deposited, its “concentration” at any
point $x$ will be given by

$$c_d = n_1 k \int_0^t c(x, t)\,dt, \tag{12}$$

where $n_1$ molecules of product are formed from a single molecule of diffusing
substance. Thus the “concentration” or “density” mapping of the
colored substance is obtained. If localization is absolute, this should be
zero for $r > a$.
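
A rough numerical sketch of equation (11) is given below. It evaluates the
integral by truncated Simpson quadrature and compares the late-time value
inside the site with the steady state found by solving the radial
steady-state equation directly for equal media. The function names, the
truncation limit `u_max`, and all parameter values are assumptions chosen for
illustration, not values taken from this paper.

```python
import math

def phi(u, a):
    """phi = (sin(au) - au cos(au)) / u^2, with its limiting value 0 at u = 0."""
    if u == 0.0:
        return 0.0
    return (math.sin(a * u) - a * u * math.cos(a * u)) / (u * u)

def concentration(r, t, m, D, k, a, u_max=400.0, n=200000):
    """Equation (11) by composite Simpson quadrature on [0, u_max] (truncated)."""
    def integrand(u):
        theta = D * u * u + k
        return phi(u, a) * math.sin(r * u) * (1.0 - math.exp(-theta * t)) / theta
    h = u_max / n
    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 2.0 * m / (math.pi * r) * total * h / 3.0

def steady_state_inside(r, m, D, k, a):
    """Steady state for r < a from D c'' + 2 D c'/r - k c + m = 0 (equal media)."""
    lam = math.sqrt(k / D)
    return (m / k) * (1.0 - (1.0 + lam * a) * math.exp(-lam * a)
                      * math.sinh(lam * r) / (lam * r))

c_late = concentration(r=0.5, t=50.0, m=1.0, D=1.0, k=1.0, a=1.0)
c_inf = steady_state_inside(r=0.5, m=1.0, D=1.0, k=1.0, a=1.0)
```

For these illustrative values the late-time quadrature and the closed-form
steady state should agree closely, which gives a quick check on the form of
the integrand before it is used in equation (12).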

Before using the equations developed in this section it must be ascer- 
tained that, to a reasonable approximation, the diffusion obeys Fick’s 
Law, the removal reaction is first-order and the diffusing substance is 
produced by a zero-order reaction. Values for D and & are then required. 
In deciding the validity of a cytochemical process it is possible to avoid the 
use of m, but if desired its value may be assessed by measuring the total 
enzymic activity of a tissue slice of known volume and the volume of the 
stained regions in an adjacent slice and then calculating the enzymic ac- 
tivity per unit volume of the stained regions. 

It is desirable to construct a measure of the theoretical degree of locali- 
zation of a cytochemical method. A number could then be assigned to a 
process to give some indication of its merit. Marked differences exist in 
the measures obtained from a variety of possible definitions; thus not all 
are dependent on the value of m. For practical utility the significance of 
the units should be immediately clear, whilst theoretically it is desirable 
that the number concerned should be easily calculated and that it should be 
independent of m and a. The following method would be suitable from the 
viewpoint of practical utility. After assessing the minimum dye density that 
can be distinctly appreciated by the eye, then the value of r at which the
dye would have attained this density, in the minimum incubation time re- 
quired to give a “good picture,” could be computed. The ratio a/r would 
be a measure of the success of the cytochemical process. This, however, 
is unsatisfactory because severe computational difficulties are involved 
and the result is dependent on both m and a. 

The ratio of the mass of colored deposit in the site to total mass of de- 
posit arising from the activity of the site in a given time would provide an 
alternative definition. The total mass of deposit is not readily calculated, 
but could be replaced by the mass that would be produced in a certain 
time if the product of the enzymic action were completely converted into
the colored product. The resulting measure $F$ has a significance that is
readily appreciated: it adopts a very low value for poor localizations and
a value approaching unity for good localizations; also it is independent
of $m$. Under circumstances when equation (11) satisfactorily describes the
concentration behavior of the product of the enzymic action then

$$F = \frac{4\pi \int_0^a r^2 c_d \,dr}{\tfrac{4}{3}\pi a^3 m n_1 t},$$

which from equations (11) and (12) becomes

$$F = \frac{6k}{\pi a^3 t} \int_0^\infty \frac{\phi^2}{\theta^2} \{\exp(-\theta t) + \theta t - 1\}\,du.$$
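
The localization index can be evaluated numerically. The sketch below assumes
the integrand is the square of (sin au - au cos au)/u^2 multiplied by
{exp(-θt) + θt - 1}/θ^2 with θ = Du^2 + k, and that the prefactor is
6k/(πa^3 t), which is what substituting equations (11) and (12) into the
defining ratio gives. The function name, the truncation limit, and the
parameter values are illustrative assumptions.

```python
import math

def localization_index(t, D, k, a, u_max=400.0, n=200000):
    """F = 6k/(pi a^3 t) * integral of phi^2 {exp(-theta t) + theta t - 1}/theta^2 du."""
    def integrand(u):
        if u == 0.0:
            return 0.0  # phi vanishes at u = 0
        phi = (math.sin(a * u) - a * u * math.cos(a * u)) / (u * u)
        theta = D * u * u + k
        return phi * phi * (math.exp(-theta * t) + theta * t - 1.0) / (theta * theta)
    h = u_max / n
    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 6.0 * k / (math.pi * a ** 3 * t) * total * h / 3.0

F_slow = localization_index(t=1.0, D=1.0, k=0.1, a=1.0)    # slow removal reaction
F_fast = localization_index(t=1.0, D=1.0, k=100.0, a=1.0)  # fast removal reaction
```

With these values the slow removal reaction should give an index near zero and
the fast one an index much closer to unity, in line with the interpretation of
F as low for poor localization and high for good localization.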

Before cytochemical results can be accepted without criticism, thor- 
ough investigation is necessary in order to demonstrate that “real” locali- 
zation occurs. When this has been established, studies of the kinetics of 
the enzyme-catalyzed and diffusible-product-removal reactions, together 
with knowledge of the appropriate diffusion coefficient, provide a means of 
assessing the precision of the localization. Investigations along these lines 
are in progress in these laboratories. It is probable that the equations given 
in this section are applicable in many cases. Otherwise suitable results are 
made available by transformations (4). If the enzyme-catalyzed reaction 
is not zero-order, the rate of production of diffusible product will be ex- 
pressible with sufficient accuracy as a polynomial function of time. Other 
complications may occur in special cases. For example, if actually de- 
posited dye is further removed by some chemical side reaction of zero- 
order, then the resultant effect might be to raise the value of the localiza- 
tion index F. 


The author is indebted to Dr. S. J. Holt for many discussions on cyto- 
chemical methods. 
The types of mathematical model which have been used to represent all-or-none behavior
in the nerve membrane may be classified as follows: (1) the discontinuous threshold phenomenon, 
in which differential equations with discontinuous functions provide both a discontinuity of 
response as a function of stimulus intensity at threshold and a finite maximum latency, (2) the 
singular-point threshold phenomenon which exists in a phase space having analytic functions in 
its differential equations and having a singular point with one characteristic root positive and 
the rest with negative real parts, the latency being unbounded, and (3) the quasi threshold 
phenomenon, which has a finite maximum latency and continuous functions, but neither a true 
discontinuity in response nor an exact threshold. Several models of the nerve membrane in the 
literature are classified accordingly, and the applicability of the different types of threshold 
phenomena to the membrane is discussed, including an extension to a stochastic model. 


INTRODUCTION 


The presence of a threshold phenomenon in a biological system im- 
poses restrictions on the types of mathematical model suitable to describe 
that system. This paper is concerned mainly with threshold phenomena in 
the nerve fiber membrane and was inspired to a great extent by the 
mathematical models proposed by G. Karreman (1951) and A. L. Hodgkin 
and A. F. Huxley (1952). A mathematical classification of threshold phe- 
nomena will be given and then used to classify several models of the nerve 
membrane and of the iron wire model of nerve which have been proposed 
by various authors. 

Figure 1a shows a typical picture of the changes of potential (V) across 
the membrane of a single giant nerve fiber of the squid, recorded between 
an external electrode and an axial internal electrode (Hodgkin, Huxley, 
and Katz, 1952). Brief current shocks of different intensities ($z$) were
applied ending at time $t = 0$. During the interval $t > 0$ a uniform zero
current flow across the membrane was maintained by the external circuit.
The form of each curve depends on the initial state of the membrane at
$t = 0$. This initial state varies continuously with $z$, and the curves of
Figure 1a may therefore be described formally by an equation of the following
form:

$$V = F(t; z) \tag{1}$$

in which $z$ appears as a parameter. As the stimulus $z$ is increased beyond
a threshold value $z_0$, the shape of the curve changes suddenly. As an
extrapolation from the experimental data, one assumes that if the curves
corresponding to all values of $z$ (within some finite interval $Z$) were
plotted, the shapes of the curves would change discontinuously as $z$ passed
the value $z_0$. In terms of the all-or-none law of physiology, these curves
are divided into two distinct classes, the “all” and the “none” curves. Within
each class, the shapes of the curves vary continuously with $z$, but there are
no intermediates between the members of the two classes. Because of random
variations in latency, it is impossible to determine from any finite
number of experiments whether the latency of a nerve fiber, as the stimulus
intensity approaches threshold from above, is bounded or unbounded.
If the maximum latency of the response is assumed finite (Pecher, 1939),
there will be some time $t_1$ such that if the ordinate $F(t_1; z)$ is plotted
against $z$ (Fig. 1b, curve “$t = t_1$”), there is a discontinuity at some
value $z_0$ of $z$. This discontinuous curve may be considered as showing the
relation between stimulus (abscissa) and response (ordinate). However, if the
initial state $F(0; z)$ is plotted against $z$ (Fig. 1b, broken line), no
discontinuity appears. The threshold phenomenon thus involves a “parting of
the ways,” at some time between zero and $t_1$, between the courses of behavior
of the membrane for $z < z_0$ and those for $z > z_0$, at least as reflected in
the potential $V$. This property should be present in any mathematical model
of the membrane.

FIGURE 1. a. Membrane action potentials from squid giant axon, showing the effect of
small differences in stimulus strength $z$ near its threshold value $z_0$. $z > z_0$ for the upper three
“ALL” curves; $z < z_0$ for the lower two “NONE” curves. Following a brief shock at zero
time, a zero membrane current is maintained by the external circuit. (Redrawn from Hodgkin,
Huxley, and Katz, 1952). b. Curves of potential measured at two fixed times, zero and $t_1$,
plotted against $z$, from the curves in a.

The properties sought belong, strictly speaking, not to the membrane 
by itself, but to the total system consisting of the membrane together with 
that part of its environment which imposes an external electrical con- 
straint upon it, which here consists of the electronic apparatus connected 
to the electrodes. The importance of the environment in helping to deter- 
mine threshold behavior is shown by the fact that such behavior is present 
when the membrane is stimulated by a short current pulse or a step change 
in current, but absent when a step change in potential is applied (Cole, 
1949; Hodgkin, Huxley, and Katz, 1949). The threshold phenomena to be 
considered here include only those in which the stimulus is over at time 
t = 0, and the external constraint is always the same during the response 
(t > 0). The case of stimulation of nerve by step currents of different
strengths will therefore be excluded. 


DEFINITIONS 


Let us assume that the state of the total system at any time may be
described by a finite number of variables of state $x_n$ ($n = 1, 2, \ldots, N$),
and that the behavior of the system can be defined by a set of differential
equations of the form

$$\frac{dx_n}{dt} = f_n(x_1, x_2, \ldots, x_N) \tag{2}$$

or, in vector notation,

$$\frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}), \tag{3}$$

where the vectors are printed in boldface type. The variables $x_n$ may be
considered as the coordinates of a vector space or phase space of $N$
dimensions, each point of which corresponds to a single state of the system
(Minorsky, 1947; Lefschetz, 1948). The state of the system is represented
at any time by a state point, which moves along a trajectory in phase space
defined by a solution $\mathbf{x}(\mathbf{x}^0; t)$ of equation (3), where
$\mathbf{x}^0$ is the initial point for $t = 0$. During the stimulus the
electrical constraint and therefore the trajectories of the phase space are
not the same as for $t \geq 0$. The point reached at $t = 0$ by the state point
as a result of the stimulus is $\mathbf{x}^0$ and is some function
$\mathbf{x}^0(z)$ of $z$. The initial point is thus under the control of the
experimenter, who can vary the parameter $z$ at will, before each stimulus
is delivered.

All or only some of the x, may be measured experimentally. The mem- 
brane potential V, which is usually measured, may in general be assumed 
to be some continuous function of the $x_n$. Several authors simply take the
membrane potential as one of the $x_n$. In the following discussion,
definitions of three different types of threshold phenomena will be formulated
in terms of the properties of trajectories in phase space, and not just in
terms of the behavior of $V$ as a function of time. However, it will always
be assumed that $V$ is so defined as a function of the $x_n$ that the
continuities or discontinuities of shape between neighboring trajectories are
not lost when they are converted to curves of $V$ plotted against $t$.

The following definition is an attempt to describe a threshold phenomenon
mathematically. Figure 2 illustrates this definition for a phase plane
($N = 2$). The trajectories in the upper right-hand region of Figure 2
could be filled in, in various ways, or simply left undefined.

FIGURE 2. Diagram of a discontinuous threshold phenomenon in a phase plane. Broken
line labeled “$t = 0$” in this and subsequent figures is the locus of initial points resulting when
the stimulus intensity $z$ is varied. $z = z_0$ at the point indicated. The two broken lines labeled
“$t = t_1$” are loci of state points for time $t_1$ and correspond to “ALL” and “NONE” responses.


Definition I.

If x⁰(z) is continuous in z over some interval Z, except possibly for a step
discontinuity at z = z₀, and if, for some time t₁ > 0, x(x⁰(z); t₁) is
continuous in z except for a step discontinuity at z = z₀, then a
discontinuous threshold phenomenon (DTP) will be said to exist in the phase space.

This definition is designed to describe a threshold phenomenon with a
bounded latency (t₁). The discontinuity in the state of the system at time
t₁, as a function of z, is provided either (1) simply by a discontinuity in
the initial condition at z = z₀, with no special conditions on f(x), or
(2) with the initial condition continuous in z, in which case limitations
must be imposed on f(x). One may wish both x⁰(z) of Definition I and
f(x) of (3) to have as components some of the elementary, differentiable
functions of calculus. The justifications for such a choice seem to be that
(1) physical and chemical processes involving many molecules are usually
described by such functions, (2) the results of several experiments always
differ so much that there is a limit to the preciseness to which the
functions should be specified, and therefore the simplest ones should be chosen,
and (3) to explain a discontinuous process by using discontinuous
functions is to use an ad hoc assumption and dodge the issue. But by the
Cauchy-Lipschitz theorem for the existence of the solutions of differential
equations (Lefschetz, 1948) one can show that if f(x) is differentiable,
with all partial derivatives uniformly bounded in a certain region, the
solution x(x⁰; t) is continuous in (x⁰; t) for all t. Then if x⁰(z) is also
continuous in z, a DTP is impossible. Nevertheless, several authors have
successfully used DTP's with discontinuous functions, as will be
discussed below.

It should be mentioned that in order to have a DTP with x⁰(z) continuous
it is not necessary that f(x) be discontinuous, but only that it fail to
satisfy a Lipschitz condition at some point. For example, a DTP appears
in the following system:*


a yl? 
CI sr) 
er ees 
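
A minimal sketch of how a continuous but non-Lipschitz f(x) can give a DTP. The system used here, dy/dt = 2 sign(y) |y|^(1/2) with initial point y⁰(z) = z − z₀, is an illustrative assumption chosen because it has a closed-form solution; it is not claimed to be the system suggested by Clauser. The function name `state_at` is likewise hypothetical.

```python
import math

def state_at(y0, t):
    """Closed-form solution of dy/dt = 2*sign(y)*sqrt(|y|) at time t.

    For y0 != 0, sqrt(|y|) grows linearly in t, so |y(t)| = (sqrt(|y0|) + t)**2.
    For y0 == 0 the solution is not unique; y = 0 is chosen here (assumption).
    """
    if y0 == 0.0:
        return 0.0
    return math.copysign((math.sqrt(abs(y0)) + t) ** 2, y0)

t1 = 1.0
for y0 in (-1e-6, -1e-12, 1e-12, 1e-6):
    # The initial point varies continuously through 0, yet the state at t1
    # stays close to -t1**2 on one side and +t1**2 on the other.
    print(f"y0 = {y0:+.0e}  ->  y(t1) = {state_at(y0, t1):+.6f}")
```

However close the two initial points are taken to y = 0, the states at t₁ remain separated by about 2t₁², which is the step discontinuity required by Definition I.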


If one wishes to use differentiable functions, it is necessary to set up a
new definition of threshold phenomenon. One way to revise Definition I
is to sacrifice the existence of a maximum latency, or finite t₁. A point
of a phase space at which all dxₙ/dt = 0 is a degenerate trajectory and is
called a singular point. Figure 3 shows a phase plane with a saddle point
(one type of singular point) at the origin of coordinates arising from
equations of the following form:


$$
\begin{aligned}
\frac{dx_1}{dt} &= p_{11}x_1 + p_{12}x_2 + q_1(x_1, x_2),\\
\frac{dx_2}{dt} &= p_{21}x_1 + p_{22}x_2 + q_2(x_1, x_2).
\end{aligned}
\tag{4}
$$

In this case the p's are constants, and the characteristic equation

$$
\begin{vmatrix} p_{11}-\lambda & p_{12}\\ p_{21} & p_{22}-\lambda \end{vmatrix} = 0
\tag{5}
$$

in λ has one positive and one negative root, and q₁ and q₂ are power series
in x₁ and x₂ beginning with terms of degree two or greater. If x⁰(z) is a
continuous function of z and describes a line segment such as that labelled
"t = 0" in Figure 3 as z is varied over the interval Z, the trajectory
having x⁰(z) as its initial point changes its shape discontinuously at a
trajectory called the separatrix, for which z has the value z₀. This discontinuity
is of a different kind from that of Definition I. In fact, according to the
Cauchy-Lipschitz Theorem, x(x⁰(z); t₁) is continuous in z for every fixed t₁,
and as z is varied, x(x⁰(z); t₁) travels continuously along a line such as that
labelled "t = t₁" in Figure 3. Let us arbitrarily define the latency as the
time required for x to go from its initial point along an "all" trajectory to
some line such as that labelled "criterion of excitation" in Figure 3. For a
model of the nerve membrane, for example, this line might correspond to
the condition that V be halfway between the resting potential and the
peak value of an action potential. There is now no maximum latency.
The nearer z approaches z₀, the longer the state point subsequently
remains in the neighborhood of the saddle point, where the phase velocity
vector dx/dt is very small. The latency can in this way be made arbitrarily
large. This property of the saddle point may be shown by plotting latency
against z. In Figure 4, typical curves of this kind are diagrammed for a
DTP and a saddle-point threshold phenomenon (STP). If the latter curve
were to diverge from the experimental curve (broken line) only for z very
near z₀, such a model could be accepted as a good approximate
representation of the nerve membrane. Both the STP and the real nerve fiber
may then show increases of latency near threshold which are similar
except that the latency of the STP approaches infinity, as z approaches z₀,
while that of the real fiber remains finite.

Figure 3. Diagram of an STP in a phase plane. S.P. is a saddle point. Typical trajectories
of the "ALL" and "NONE" classes are labeled. The line marked "criterion of excitation"
is also shown. See text.

* Suggested by Dr. F. H. Clauser.
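
The unbounded latency can be illustrated with the simplest possible saddle. The sketch below assumes a purely linear system, dx/dt = −x, dy/dt = λy, with initial point (1, z − z₀) and the criterion of excitation taken as the line y = 1. These choices, and the names `latency`, `lam` and `criterion`, are illustrative assumptions rather than anything specified in the text.

```python
import math

def latency(z, z0=0.0, lam=1.0, criterion=1.0):
    """Time for y(t) = (z - z0) * exp(lam * t) to reach the criterion line.

    Subthreshold or exactly-threshold stimuli (z <= z0) never reach it,
    so the latency is reported as infinite.
    """
    y0 = z - z0
    if y0 <= 0.0:
        return math.inf
    return max(0.0, math.log(criterion / y0) / lam)

for dz in (1e-1, 1e-3, 1e-6, 1e-9):
    print(f"z - z0 = {dz:.0e}   latency = {latency(dz):.3f}")
```

In this linear sketch the latency grows like ln(1/(z − z₀)), so it can be made as large as desired by taking z close enough to z₀, but only logarithmically.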
The trajectories in the neighborhood of the saddle point in Figure 3 fall
into three classes according to their behavior: (1) two trajectories are
stable, approaching the saddle point as time increases; (2) two others are
unstable, approaching it as time decreases; and (3) all the rest are
hyperbolic, first approaching and then leaving the saddle point as time increases.
Furthermore, the hyperbolic trajectories can be divided into four
subclasses according to the directions along which they approach and leave
the saddle point. If two trajectories are chosen from any two of these
different classes or subclasses, it is impossible to deform one into the other
by passing through a continuum of intermediate trajectories. It is this
topological property which is responsible for the threshold characteristics
of the saddle point. As z is varied through Z and the initial point x⁰(z)
moves along the line "t = 0," it passes discontinuously from one subclass
of hyperbolic trajectories to another subclass which behaves in a
qualitatively different manner for increasing t. Both of these subclasses occupy
contiguous 2-dimensional regions of the phase plane and are separated by
a single stable trajectory, the separatrix.

Figure 4. Diagram of latency as a function of stimulus intensity for a DTP and an STP.
(Axis: latency; regions marked "subthreshold" and "suprathreshold.") See text.

The saddle-point threshold phenomenon may be generalized to a phase 
space of any finite number of dimensions. The following definition is ap- 
plicable to systems with functions f(x) which are analytic at a singular 
point, i.e., can be expanded in a Taylor series about that point. For some- 
what more general conditions, see I. Petrowsky (1934). 
Definition II.

A singular-point threshold phenomenon (STP) will be said to exist in an
N-dimensional phase space (N ≥ 1) if there exists an isolated singular
point having one characteristic root positive and all of the others (if
any) with negative real parts, and if x⁰(z) is a continuous function
which intersects and is not tangent to the (N − 1)-dimensional
surface (the separatrix) composed of stable trajectories, for z = z₀.


In two dimensions, the condition on the characteristic roots defines a 
saddle point. In three dimensions the properties of the singular point can 
be visualized if we let the trajectories be described by the solutions 


$$ x_n = a_n e^{\lambda_n t} \qquad (n = 1, 2, 3) \tag{6} $$

of the differential equations

$$ \frac{dx_n}{dt} = \lambda_n x_n. \tag{7} $$

The aₙ's are constants of integration and are the coordinates of the initial
point. The origin of coordinates is a singular point.

If λ₁ < λ₂ < 0 < λ₃, then the plane x₃ = 0 (the separatrix) contains
the stable solutions and divides the space locally into two regions in both
of which the trajectories are hyperbolic, but in which their behavior for
increasing t is qualitatively different. Those trajectories for which a₃ < 0
approach the negative x₃-axis pointing toward minus infinity on that axis
and may be taken to represent the "none" response. Those trajectories
for which a₃ > 0 approach the positive x₃-axis and point toward plus
infinity ("all" response). The separatrix separates the "all" from the
"none" trajectories. However, if λ₁ < 0 < λ₂ < λ₃, the hyperbolic
trajectories fall into two subclasses which do not differ qualitatively in their
behavior for increasing t, but approach the plane x₁ = 0, pointing in all
possible directions in that plane. In the latter case, therefore, there is no
threshold phenomenon.

In general, for an STP to exist in any N-dimensional phase space, the
singular point must have the property that an (N − 1)-dimensional
surface consisting of stable trajectories (the separatrix) forms a local
boundary between two N-dimensional regions both of which consist of
hyperbolic trajectories which for large enough t leave the singular point in two
opposite directions. If all the characteristic roots have non-zero real parts,
then the above conditions are fulfilled if, and only if, the roots are as
specified in Definition II (Petrowsky, 1934; Lefschetz, 1948). Cases in which
some of the roots have zero real parts are more complicated and have to
be examined separately.
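
The condition on the characteristic roots in Definition II is easy to state as a check. The sketch below tests only that condition (exactly one real positive root, all others with negative real parts); it does not test the second requirement, that x⁰(z) crosses the separatrix without being tangent to it. The function name `roots_allow_stp` is hypothetical.

```python
def roots_allow_stp(roots):
    """Return True if the characteristic roots satisfy Definition II.

    `roots` is a sequence of real or complex numbers, one per dimension.
    Exactly one root must be real and positive; every other root must
    have a negative real part. Roots with zero real part are rejected,
    since those cases have to be examined separately.
    """
    roots = [complex(r) for r in roots]
    positive = [r for r in roots if r.imag == 0.0 and r.real > 0.0]
    others = [r for r in roots if not (r.imag == 0.0 and r.real > 0.0)]
    return len(positive) == 1 and all(r.real < 0.0 for r in others)

print(roots_allow_stp([-2.0, -1.0, 1.0]))   # one positive, two negative roots
print(roots_allow_stp([-1.0, 1.0, 2.0]))    # two positive roots
print(roots_allow_stp([1.0]))               # N = 1, a single positive root
```

The first and third calls correspond to the threshold cases discussed in the text; the second corresponds to the three-dimensional case with two positive roots, for which there is no threshold phenomenon.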

A second way to revise Definition I is to keep the existence of a
maximum latency, but sacrifice the discontinuity between the "all" and the
"none" trajectories. Figure 5a shows an example in a phase plane. For all
t, x(x⁰(z); t) is continuous in z, but for some values of t, it varies very
rapidly when z is near z₀. Figure 5b shows how V = F(t; z) might appear
as plotted against z. The discontinuous curve of Figure 1b has been
replaced by a continuous curve with a very rapid rise near z = z₀. We may
describe these properties mathematically as follows:

Figure 5. a. Diagram of a QTP in a phase plane. The shape of the trajectory changes
continuously as z is varied. The two-dimensional separatrix is cross-hatched. b. Curve analogous
to that of Figure 1b, but plotted for the case of Figure 5a.


Definition III.

If x⁰(z) is continuous in z over some interval Z, and if there exist a
positive time t₁ and two values z₁ and z₂ of z such that the ratio

$$ \frac{\left| x[x^0(z_2); t_1] - x[x^0(z_1); t_1] \right|}{\left| z_2 - z_1 \right|} $$

is sufficiently large, then a quasi threshold phenomenon (QTP) will be
said to exist in the phase space.


This definition is necessarily inexact, since the phrase "sufficiently
large" is subject to arbitrary interpretation. A QTP may therefore grade
insensibly into what is for all practical purposes not a threshold
phenomenon at all. However, QTP's have been used by several authors, and a
criterion for judging a QTP, based on statistical considerations, is discussed
below.

For the sake of comparison, one may say that the (N − 1)-dimensional
separatrix of the STP has been replaced in the QTP by a "thin"
N-dimensional neighborhood of an (N − 1)-dimensional surface, as indicated
by the cross-hatched region in Figure 5a. The thickness of this
neighborhood, i.e., the magnitude of its smallest dimension, determines the
sharpness of the QTP. The single threshold value z₀ of stimulus may be
considered as replaced by the closed interval [z₁, z₂].

The three types of threshold phenomenon mentioned above do not ex- 
haust the possibilities. One can also set up a threshold phenomenon with 
some of the properties of the saddle-point type by using a line or surface 
consisting of singular points instead of an isolated one. Also, a limit cycle 
(periodic closed trajectory), to which some trajectories are stable and 
others unstable, can be substituted for the saddle point. In this case again 
hyperbolic trajectories can be found with any specified latency, no matter 
how large, most of which may be spent with the state point oscillating in 
the neighborhood of the limit cycle. 


EXAMPLES 


Some of the models of excitable surfaces which have been proposed by 
various authors can now be classified according to the type of threshold 
phenomenon present. First, however, it will be necessary to define more 
precisely the conditions of environmental constraint to be imposed on the 
systems under discussion. As mentioned above, an excitable system is not 
isolated from its environment, but linked to it by one or more variables, 
which for a nerve fiber membrane are membrane potential difference and 
membrane current density. Moreover, the potential difference and current 
density in general are not constant everywhere on the membrane, but vary 
from point to point. A nerve membrane may be considered as being made 
up of a large number of elementary areas of molecular dimensions. Over each 
such area the current density and potential difference may be considered 
to have single values, but in different areas they may have different values 
at the same time. If a region in the neighborhood of a stimulating electrode 
were to be described mathematically, it would be necessary to have a 
complete set of variables of state for each elementary area, and the corre- 
sponding phase space would have too many dimensions for convenient 
treatment. Fortunately, the presence of threshold behavior in a membrane 
does not appear to depend necessarily on the interaction of different ele- 
mentary areas, as does the conduction of a nerve impulse. Both the giant 
axon of the squid (Cole, 1949; Marmont, 1949; Hodgkin, Huxley, and 
Katz, 1949) and the iron wire model of nerve (Bonhoeffer, 1941, 1948) 
have been stimulated to produce an all-or-none response uniform over the 
surface, named by A. L. Hodgkin and A. F. Huxley (1952) the "membrane
action potential," during which the membrane current is held
uniformly at zero, following an initial current pulse as stimulus.


1. Discontinuous Threshold Phenomena 


The models of nerve of N. Rashevsky (1933, 1948) and A. V. Hill (1936) 
are incomplete in the sense that they describe the behavior of the system 
only up until the time that a certain variable of state reaches a threshold 
value. If the subsequent events were to be described by an extended 
model, of the type considered in this paper, either the differential equa- 
tions in these models would have to be changed in order to describe the 
behavior of the system during the response, or new variables of state 
would have to be introduced for this purpose. In either case, since there 
is in these models a maximum time tₘ after stimulation at which time the
threshold value can be reached, if it is to be reached at all, there is some
time t₁ (> tₘ) for which Definition I would apply.

W. A. H. Rushton’s (1938) model of nerve contains in the equivalent 
circuit of the membrane an e.m.f. which disappears when V reaches a 
certain value during the stimulating shock. This makes x⁰(z) discontinuous
at z₀, and there is a DTP.

In the model of F. Offner, A. Weinberg, and G. Young (1940), when
the membrane potential V reaches a critical value V_c, a resistance in the
equivalent circuit of the membrane decreases discontinuously from its
resting value to its excited value. This is a DTP with N = 1; x is V,
x⁰(z) is continuous, but f(x) is discontinuous at V = V_c.


2. Singular-point Threshold Phenomena 


G. Karreman and H. D. Landahl have described several models of an 
excitable membrane. In the simplest (Karreman, 1951; Karreman and
Landahl, 1952), N = 1, and the phase space is a line on which there are
three singular points. The first in succession is stable and corresponds to 
the resting state. The second is unstable and is the site of an STP. The 
third corresponds to a stable excited state; no provision is made for re- 
covery from excitation in this model. As the stimulus z (applied negative 
potential difference) is increased beyond its threshold value, x°(z) passes 
the unstable singular point. Thereafter the state point passes to the excited 
state. 
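
A one-dimensional threshold of this kind can be sketched with a toy equation. The cubic right-hand side used below, dx/dt = −x(x − a)(x − 1) with a = 0.3, is an assumption made purely for illustration: it has a stable resting point at 0, an unstable point at a, and a stable excited point at 1, but it is not the equation of Karreman and Landahl. The function name `final_state` is hypothetical.

```python
def final_state(x0, a=0.3, dt=0.01, t_end=200.0):
    """Integrate dx/dt = -x (x - a)(x - 1) by forward Euler, starting at x0."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (-x * (x - a) * (x - 1.0))
    return x

for x0 in (0.29, 0.31):
    # Initial points just below and just above the unstable singular point.
    print(f"x0 = {x0:.2f}  ->  x(t_end) = {final_state(x0):.4f}")
```

Initial points on either side of the unstable point end at different stable states, and there is no recovery from the excited state, as in the simplest model described above.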

In a more complicated model, N = 2, and the system can be represented
on a phase plane with coordinates x and y (Karreman and Landahl,
1952, 1953). This model shows different mathematical properties
according to the value of the parameter r, which depends on various physical
properties of the membrane. The singular points, at which dx/dt =
dy/dt = 0, are the points of intersection between two curves called
isoclines: the vertical isocline is the curve on which dx/dt = 0; the
horizontal isocline, that on which dy/dt = 0. The isoclines are shown in all
figures with short straight arrows crossing them in a direction which is
the same as that of the trajectories crossing them. When r = 183, there
are three singular points (Fig. 6). Singular point A corresponds to the
stable resting state. Point B is a saddle point at which there is an STP,
and C is unstable. When r has larger values, the threshold phenomenon
changes to a QTP, as described below.

Figure 6. STP in Karreman and Landahl's membrane model when the parameter r = 183.
(Modified from Karreman and Landahl, 1952, 1953.) A is the stable resting state, B a saddle
point, and C an unstable singular point. "sep." indicates the separatrix. Typical "ALL" and
"NONE" trajectories are shown, the former only at its beginning and end; the omitted
central part circles around C.

Recently M. J. Polissar (in Johnson, Eyring, and Polissar, 1954) has
presented a mathematical model of the nerve or muscle membrane in
which N = 2. The variables of state of the system are in his notation E,
the membrane potential, and the P.D.M., or "potential demand of the
membrane." The model is based on a phenomenological picture of the
membrane in which "it is assumed that the change in the transmembrane
potential E is the chief factor determining the change in the structure of
the membrane. ... A given transmembrane potential demands a
particular state of the membrane. In turn, each instantaneous state of the
membrane demands a particular value for the transmembrane potential."
This situation is then described more specifically as the tendency of E
and the P.D.M. to vary in certain directions which depend on the values
of both of them. Polissar represents the state of the system at any
moment by the position of two "conjugate points" in a plane having E and
the P.D.M. as coordinates (but not a phase plane). Each of these points
is constrained to move only along a corresponding curve in the plane. The
position of one point on its curve is determined by the value of E; that of
the other by the value of the P.D.M. But if we replace these two points
by a single state point which can range over a phase plane with coordinates
E and P.D.M., the resulting representation is mathematically equivalent
to Polissar's, and allows one to visualize the over-all behavior of the
system more easily than does his. The phase plane representation will
therefore be used in the following discussion.

Polissar does not explicitly state differential equations corresponding
to (2), but gives qualitative rules for the behavior of E and the P.D.M.
under the condition of zero membrane current. These rules determine the
qualitative properties of the trajectories. In Figure 7, the vertical isocline
is defined by the condition dE/dt = 0, and the horizontal isocline by
d(P.D.M.)/dt = 0. These isoclines happen to be the same lines as those
along which the two conjugate points move in Polissar's original
representation. The two isoclines intersect at three singular points A, B, and C. In
the region above the vertical isocline dE/dt is positive and is negative
below it. Above the horizontal isocline d(P.D.M.)/dt is negative and
positive below it. The trajectories which have been sketched in Figure 7
show that A and C are stable singular points, and B is a saddle point, at
which there is an STP. Point A corresponds to the resting state.

Polissar takes our z to be the duration of a stimulating current pulse of
constant intensity, which we may take as ending at zero time. If z < z₀,
the initial point will be at a point such as D and will return to A along a
trajectory such as that labelled "NONE." If z > z₀, the state point will
move along a trajectory such as that marked "ALL," going from E to C.
Point C represents a state of excitation which is stable—this model does 
not describe the process of recovery. However, with the phase plane repre- 
sentation it would be possible to modify this model to show recovery by 
changing C to an unstable singular point, as in Karreman and Landahl’s 
two-dimensional model mentioned above. 
Figure 7. STP in Polissar's membrane model. (Modified from Johnson, Eyring, and
Polissar, 1954.) See text.


3. Quasi Threshold Phenomena


K. F. Bonhoeffer (1948) described the behavior of the iron wire model
of the nerve fiber qualitatively in a phase plane (N = 2). The coordinates
x and y are two quantities of which the physical nature is not completely
specified; x is the "degree of activation" and y is the "refractoriness" of
the wire. Bonhoeffer did not state his differential equations explicitly, but
represented the isoclines and a few of the trajectories graphically. Figure 8
shows no singular point in the region of divergence of the "all" from the
"none" trajectories for this model. Since the trajectories and isoclines are
drawn as smooth lines, all functions were presumably intended to be
differentiable, and this model may be classified as a QTP.

Figure 8. QTP in Bonhoeffer's model of the iron wire model of nerve. (Redrawn from
Bonhoeffer, 1948.) See text.

In the model (N = 2) of Karreman and Landahl (1952, 1953) discussed
above, a QTP is obtained when the parameter r has a value of 190 or
greater. The horizontal isocline moves to the right as r is increased, points
B and C approach each other, coalesce, and disappear leaving A as the
only singular point (Fig. 9).

Figure 9. QTP appearing in Karreman and Landahl's model when r = 200. (Redrawn
from Karreman and Landahl, 1953.)

Hodgkin and Huxley's (1952) model of the squid giant axon membrane
is the most complex yet proposed. It has five variables of state: the
membrane current density I, the membrane potential V, and three variables
n, m, and h, which determine the potassium and sodium conductances.
The equations describing the behavior of these variables are of the
following form:

$$ I = C_M \frac{dV}{dt} + F_1(n, m, h, V), \tag{8} $$

$$ \frac{dn}{dt} = F_2(n, V), \tag{9} $$

$$ \frac{dm}{dt} = F_3(m, V), \tag{10} $$

$$ \frac{dh}{dt} = F_4(h, V), \tag{11} $$

where C_M is the membrane capacitance per unit area of membrane and
the F's are analytic functions. In the case of the membrane action
potential, the condition of external electrical constraint during t > 0 is I = 0,
and the variables of state are reduced to V, n, m, and h. Equation (8) then
becomes

$$ \frac{dV}{dt} = -\frac{1}{C_M} F_1(n, m, h, V). \tag{12} $$

Equations (9) through (12) represent the equations (2) for this model,
and N = 4. The singular points in this four-dimensional phase space can
be determined from (9)-(12) by setting

$$ \frac{dV}{dt} = \frac{dn}{dt} = \frac{dm}{dt} = \frac{dh}{dt} = 0. \tag{13} $$


There is a line in the five-dimensional space (with coordinates I, V, n, m,
h) defined by the four equations which result when (13) is substituted into
(8)-(11). This line is the locus of all possible singular points that can
arise in the five-space as a result of applying any arbitrary external
electrical constraint to I and V. For the membrane action potential, there
will be as many singular points as there are intersections of this locus with
the hyperplane I = 0. These may be found by studying the projections
of the locus and of the hyperplane on the I-V plane. The projection of
the hyperplane is simply the line I = 0. The projection of the locus is
described by the following equation,

$$ I = F_1(n_\infty, m_\infty, h_\infty, V), \tag{14} $$

which is obtained by substituting (13) into (8). Here n_∞, m_∞, and h_∞ are
the values assumed by n, m, and h when (13) is substituted into (9)-(11);
they are analytic functions of V only. The right-hand side of (14) is equal
to the sum of the steady-state potassium, sodium, and "leakage" currents
(I_K, I_Na, and I_l). These currents are plotted as functions of V in Figure 10.
The curve labelled "I" is the projected locus of singular points in the I-V
plane, and intersects the horizontal axis only once, at the origin. This
intersection determines the only singular point, which corresponds to the
stable resting state. Since there is no saddle point, there is no STP. Since
all functions are analytic and therefore satisfy the conditions of the
Cauchy-Lipschitz theorem, there is no DTP, and a QTP is indicated. If a
QTP exists here, it should be possible to find a value of stimulus which
would produce a response intermediate between "all" and "none" for this
model. K. S. Cole (1954; Cole et al., 1955), however, found a value for the
threshold stimulus (intensity of a 0.01 msec shock) in this case accurate to
one part in sixty thousand, with no intermediate responses. By studying
the projections of trajectories in the four-dimensional phase space onto a
plane with coordinates V and dV/dt, he concluded that there are two
non-stable singular points, in addition to the stable singular point mentioned
above. Certain of these projected trajectories do behave in such a way as
to suggest the presence of a saddle point, but since the method just
described indicates the existence of only one singular point (at least as
defined in the present paper), the stable one, it would appear that the
complete phase velocity vector in the four-dimensional phase space, although
it may become very small in a certain region, does not vanish completely.
In such a case, it might be possible to interpret a QTP as a lower-dimensional
"moving saddle point," in a way similar to A. J. Lotka's (1925)
interpretation of a quasi-equilibrium state governed by one slowly changing
variable as a "moving equilibrium," but this idea will not be pursued
further here.
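
The count of singular points can be checked numerically by evaluating the steady-state current over a range of V and counting its zero crossings. The rate functions and constants below are not given in this text; they are the standard ones from Hodgkin and Huxley (1952) at 6.3° C, written here with V as the displacement from rest in mV and depolarization taken as positive (an assumed sign convention, opposite to the original papers). The function names are hypothetical.

```python
import math

def x_over_expm1(x):
    """x / (exp(x) - 1), with the removable singularity at x = 0 handled."""
    if abs(x) < 1e-7:
        return 1.0 - x / 2.0
    return x / math.expm1(x)

def steady_gates(v):
    """Steady-state values of n, m, h at displacement v (mV)."""
    a_n = 0.1 * x_over_expm1((10.0 - v) / 10.0)
    b_n = 0.125 * math.exp(-v / 80.0)
    a_m = 1.0 * x_over_expm1((25.0 - v) / 10.0)
    b_m = 4.0 * math.exp(-v / 18.0)
    a_h = 0.07 * math.exp(-v / 20.0)
    b_h = 1.0 / (math.exp((30.0 - v) / 10.0) + 1.0)
    return a_n / (a_n + b_n), a_m / (a_m + b_m), a_h / (a_h + b_h)

def steady_current(v):
    """Sum of steady-state K, Na and leakage currents (uA/cm^2)."""
    n, m, h = steady_gates(v)
    i_k = 36.0 * n**4 * (v + 12.0)          # conductances in mmho/cm^2
    i_na = 120.0 * m**3 * h * (v - 115.0)
    i_l = 0.3 * (v - 10.613)
    return i_k + i_na + i_l

def count_zero_crossings(lo=-50.0, hi=150.0, steps=4000):
    """Number of sign changes of the steady-state current on a uniform grid."""
    values = [steady_current(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0.0)

print("zero crossings of the steady-state I-V curve:", count_zero_crossings())
```

Under these assumptions the curve crosses I = 0 only once on the sampled range, near V = 0, in agreement with the single singular point described above. A grid search of this kind can miss crossings that lie outside the range or closer together than the grid spacing.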

Since the experimentally produced membrane action potential obeys 
the all-or-none law, it might be thought that a QTP would be unsuitable
for a model of the membrane. Whether or not this is so will depend con- 
siderably on the magnitude of certain statistical variations inherent in 
the assumed physical nature of the Hodgkin-Huxley model, although they 
do not appear explicitly in their differential equations. The mathematical 
formulation of the model could be extended to include these statistical 
variations in the values of the variables of state by assuming that the 
state point travels through phase space with a “Brownian motion” super- 
imposed on an average drift velocity given by the phase velocity vector. 
In most of the phase space, this random motion would be so small com- 
pared to that resulting from the phase velocity vector that the former 
would be negligible in determining the behavior of the system. In the 
neighborhood of the separatrix of a threshold phenomenon, however, the 
random motion might be so large as to decide whether a state point tends 
to follow the “all” or the “none”’ set of trajectories. In a statistical en- 
semble of systems originally in the resting state, all of which are given 
identical stimulus shocks of the same near-threshold intensity, a certain 
proportion of them may give an action potential and the remainder may 
not. A similar result has been found experimentally; a nerve fiber may
respond to some, but not all of a succession of stimuli, and this is due to
variability in the behavior of the fiber rather than in the strength of the
stimulus (Blair and Erlanger, 1933).

Random motion of the state point in phase space could produce this
result in any of the three types of threshold phenomenon described earlier.
It assumes particular importance for the QTP, however, since it provides
a criterion for judging whether a QTP is sufficiently sharp to portray
adequately an all-or-none process. If the random motion during and following
the stimulus is of an order of magnitude equal to or greater than the
thickness of the "thin" N-dimensional separatrix (mentioned earlier), the
QTP may be considered satisfactory, since the state point will then very
seldom stay within the separatrix long enough to be carried to a region of
phase space corresponding to a response intermediate between "all" and
"none." Cole's (unpublished) calculations give the figures −6.373043 and
−6.372943 mV for the values of V immediately following brief stimulating
pulses which are slightly "suprathreshold" and "subthreshold"
respectively. The difference between these two values is 0.100 µV, which
may be taken as the difference in stimulus intensities which appears in the
denominator of the fraction appearing in Definition III. Time t₁ can be
taken to be 7.9 msec, the time of the peak of the action potential following
a just suprathreshold stimulus. Then the difference between the values
of V for the "all" and "none" responses, measured at time t₁, is 98 mV.
Therefore the ratio in Definition III will have 9.8 × 10⁵ as a lower limit.
If it can be shown that fluctuations in V of the order of magnitude of
0.100 µV can often be expected by chance, then the QTP of the Hodgkin-
Huxley model is sufficiently sharp for its purpose.
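
The arithmetic behind the lower limit of the ratio in Definition III can be laid out as a short calculation. The sketch uses only the figures quoted above; the variable names are hypothetical, and only the V component of the state is considered, which is why the result is a lower limit.

```python
# Values of V (mV) immediately after slightly supra- and subthreshold pulses.
v_supra_mv = -6.373043
v_sub_mv = -6.372943

# Denominator of the ratio: difference in initial V, taken as the stimulus difference.
delta_z_mv = abs(v_supra_mv - v_sub_mv)

# Numerator: difference in V between "all" and "none" responses at t1 = 7.9 msec.
delta_v_mv = 98.0

ratio = delta_v_mv / delta_z_mv

print(f"stimulus difference = {delta_z_mv * 1000:.3f} microvolts")
print(f"ratio               = {ratio:.3g}")
```

Both quantities are expressed in millivolts, so the ratio is dimensionless.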

Fluctuations might be expected in any of the variables n, m, h, and V,
but only those in V resulting from a random e.m.f., similar to that which
appears across all electrical conductors (Johnson, 1928), will be
considered. These fluctuations are of the same nature as those considered at the
end-plate of muscle by P. Fatt and B. Katz (1952). The properties of this
"noise e.m.f." are independent of the physical nature of the conductor
and depend only on the value of the resistance and the temperature. The
mean square noise e.m.f. across any linear, passive impedance in the
frequency range df is given by the formula

$$ \langle E^2 \rangle_f \, df = 4 R_f k T \, df, \tag{15} $$


where R_f is the real part of the impedance for frequency f, k is Boltzmann's
constant, and T is the absolute temperature (Nyquist, 1928). Hodgkin
and Huxley's equivalent circuit for a unit area of the membrane is a
capacitance C_M in parallel with three branches, each consisting of an e.m.f. and
a resistor in series. If g denotes the total D.C. conductance of this circuit
for the resting state, then

$$ R_f = \frac{g}{g^2 + 4\pi^2 f^2 C_M^2}. \tag{16} $$

The total root-mean-square noise e.m.f. over the entire frequency range
from zero to infinity of a section of membrane of area A in cm² at 6.3° C
and for C_M = 1.0 µF/cm² is

$$ E_{\mathrm{rms}} = \left( \frac{kT}{C_M A} \right)^{1/2} \approx \frac{0.062}{\sqrt{A}}\ \mu\mathrm{V}. \tag{17} $$

The experiments on which Hodgkin and Huxley based their model were 
done using an area of approximately 0.110 cm² (Hodgkin, Huxley, and 
Katz, 1952), which gives a root-mean-square noise e.m.f. of 0.19 µV, when 
substituted into (17). This value includes noise of all frequencies, but noise 
of too low and too high frequencies may be ineffective in causing random 
transitions between “all” and “none” trajectories. Since the latency of 
the Hodgkin-Huxley model just at threshold is about 6 msec (Cole et al., 
1955), the most effective frequency range may be estimated as between 10 
and 1000 cycles/sec. Integration of (15) between these limits, assuming 
g = 0.639 mmho/cm², gives a root-mean-square noise e.m.f. of 0.17 µV in 
the resting state. Moreover, noise in the physical model resulting from 
sources other than the conducting ions would increase this figure. Since the 
figure of 0.17 µV is of the same order of magnitude as the value of 0.100 
µV calculated by Cole for the difference in V between an “all” and a 
“none” trajectory, the QTP of the Hodgkin-Huxley model is sufficiently 
sharp for its purpose. 
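
The two noise figures quoted above can be checked numerically. The following is a minimal sketch, assuming that the resting membrane behaves as the conductance g in parallel with the capacitance C_M, so that the real part of the impedance of an area A is g/[A(g² + 4π²f²C_M²)]; Nyquist's formula (15) is then integrated in closed form. The function and constant names are illustrative and do not come from the original text.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
T_KELVIN = 273.15 + 6.3      # 6.3 deg C, as in the text
C_M = 1.0e-6                 # F/cm^2 (1.0 uF/cm^2)
G_REST = 0.639e-3            # mho/cm^2 (0.639 mmho/cm^2)
AREA = 0.110                 # cm^2


def rms_noise_total(area_cm2, c_m=C_M, temp=T_KELVIN):
    """RMS noise e.m.f. in volts over all frequencies: sqrt(kT / (C_M * A))."""
    return math.sqrt(K_BOLTZMANN * temp / (c_m * area_cm2))


def rms_noise_band(area_cm2, f_lo, f_hi, g=G_REST, c_m=C_M, temp=T_KELVIN):
    """RMS noise e.m.f. in volts between f_lo and f_hi (cycles/sec).

    Closed-form integral of 4 k T R_f df, assuming
    R_f = g / (A * (g**2 + (2*pi*f*c_m)**2)).
    """
    w = 2.0 * math.pi * c_m / g
    span = math.atan(w * f_hi) - math.atan(w * f_lo)
    mean_square = 2.0 * K_BOLTZMANN * temp * span / (math.pi * c_m * area_cm2)
    return math.sqrt(mean_square)


total_uv = rms_noise_total(AREA) * 1e6
band_uv = rms_noise_band(AREA, 10.0, 1000.0) * 1e6
print(f"all frequencies: {total_uv:.3f} uV; 10-1000 cycles/sec: {band_uv:.3f} uV")
```

Under these assumptions both results come out close to the 0.19 µV and 0.17 µV given in the text, which supports reading the band-limited figure as only slightly smaller than the total.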

This discussion has shown that any intermediates between “all” and 
“none” behavior in the Hodgkin-Huxley model will appear only when the 
accuracy of specifying the initial conditions is increased beyond the limits 
of uncertainty which appear when the physical interpretation of the model 
is considered. Moreover, the possibility remains that just at the threshold 
of excitation the assumption that I = 0 at each point of the membrane 
fails in the experimental situation. If a slight inhomogeneity of the state 
of the membrane or of the potential distribution could cause an excitation 
to begin locally and spread elsewhere rapidly as a result of current flow 
between different areas, the assumptions of the above analysis would be 
false and its conclusions might no longer be applicable. 

The experimentally obtained static or steady-state I-V curve of a 
membrane forms at least one branch of the projected locus of singular points 
mentioned above. The experimental data for this curve for the squid axon 
(circles in Fig. 10) lie along a curve of similar shape to the theoretical 
locus, crossing the horizontal axis only once, at the origin. S-shaped curves 
of the type shown in Figure 10 of Hodgkin, Huxley, and Katz (1952), 
showing the relation between ionic current density and membrane 
potential measured at a fixed time after shocks of various strengths, do not 
imply the presence of three singular points, although they cross the 
horizontal axis (I = 0) in three places, since such a curve is not a projection 


FIGURE 10. Components of membrane current plotted against membrane potential V in the 
steady state, for Hodgkin and Huxley’s membrane model. Circles are experimental data 
taken from Figure 13 of Hodgkin, Huxley, and Katz (1952) for the curve “I,” which 
represents both the steady-state total membrane current as a function of V and the projection of 
the locus of singular points onto the I-V plane. The single intersection of curve “I” with the 
horizontal axis excludes an STP. Curves of the potassium, sodium, and leakage current 
components of I are labeled “I_K,” “I_Na,” and “I_l” respectively. For oval, see text. 


of a locus of singular points. Therefore when such an S-shaped curve is 
obtained experimentally, it does not mean that the real system can be 
described by an STP, as seems to be implied by Bonhoeffer (1953). It might 
be possible to modify Hodgkin and Huxley’s model to contain an STP if 
an additional branch were added to the locus of singular points, as, for 
example, the oval sketched as a broken line in Figure 10. In such a model 
there would be three singular points whenever I took a constant value 
within a certain interval L which includes zero. An STP would not be 
obtained if I were fixed at any constant value outside of L. 

DISCUSSION 

In the foregoing classification of threshold phenomena, detailed local 
mathematical properties (“in the small”) such as differentiability and the 
nature of the characteristic roots of a singular point, play a role which 
may seem exaggerated, in view of the fact that the precision with which a 
mathematical model of a biological process can be made to agree with ex- 
periment is limited by the variability of the data. This objection can be 
answered by saying that as long as exact mathematical expressions are 
used in models of biological systems, the properties and limitations of 
these expressions are worth understanding. But this answer only poses 
the further question whether the differential equation, a type of descrip- 
tion of nature which has been borrowed from physics and chemistry, really 
is appropriate to describe a biological system whose properties have not 
yet been traced to physical and chemical mechanisms. 

The types of threshold phenomena discussed here differ not only in the 
detailed properties of their differential equations, but more generally in 
the disposition of their trajectories in phase space, in particular (1) the 
division of trajectories into distinct classes such that those in each class 
can be obtained one from another by a continuous deformation through 
other members of that class, while the members of two different classes 
are not so connected; and (2) the existence of boundary regions in phase 
space where trajectories of different classes meet. These properties (“in 
the large”) are invariant under continuous, one-to-one transformations 
of the coordinates of phase space and fall within the domain of topology, a 
branch of mathematics which may be intrinsically better fitted for the pre- 
liminary description and classification of biological systems than analysis, 
which includes differential equations (cf. Minorsky, 1947, Introduction; 
Rashevsky, 1954). This suggestion is of little practical value at present, 
since too little is known of the topology of vector fields in many-dimension- 
al spaces, at least to those interested in theoretical biology. Nevertheless, 
the most logical procedure in the description of a complex biological sys- 
tem might be to characterize the topology of its phase space, then to es- 
tablish a set of physically identifiable coordinates in the space, and finally 
to fit differential equations to the trajectories, instead of trying to reach 
this final goal at one leap. 


I wish to thank Dr. Kenneth S. Cole for his interest in this work and 
for allowing me to use unpublished calculations which were made for him 
on the National Bureau of Standards digital computer SEAC by H. A. 
Antosiewicz and P. Rabinowitz. Drs. Earl Coddington and John Moore 
made valuable suggestions during the course of the work. 

The difference of the topological information content of two reacting molecules and that of 
their reaction products is calculated for several topological types of chemical reactions, illus- 
trating the influence of the structure of the reagents and of the reaction product. It is shown 
that the change in the topological information content in a chemical reaction can be positive 
as well as negative, depending on the way the reagents approach each other and thus on the 
reaction product formed. A quantitative measure of structural specificity is introduced. 


In a very interesting paper N. Rashevsky (1955) has shown how the 
structure or topology of the constituent organic molecules of an organism 
can be taken into account in the determination of their information 
content. Only such molecules are considered by Rashevsky (loc. cit.) which 
consist of physically indistinguishable atoms and which differ from each 
other only in the number of atoms which they contain and in the topology 


FIGURE 1 (panels a, b, and c) 


of their structural formulae. Rashevsky (loc. cit.) has introduced the 
concept of the “topological information content” of a molecule, a quantity 
which to some extent reflects the topological structure of the molecule. 
Of course, frequently in living processes chemical reactions between the 
molecules rather than the molecules themselves play a fundamental role. 
Therefore we will calculate here the change of the topological information 
content in a chemical reaction. As a simple example, consider the 
molecules a and b in Figure 1. We assume that they form the molecule c in 
Figure 1. The topological information content of each molecule before the 
reaction, a or b, as well as that of the molecule after the reaction, c, is zero 
(Rashevsky, loc. cit.). Therefore, there is no change in topological 
information content in this reaction. However, that is usually not the case in a 
chemical reaction. To show that, we consider a reaction of the kind illus- 
trated in Figure 2. In Figure 2a the points 1 and 6 are of degree one, 2 and 
5 of degree three, and 3 and 4 of degree two. The probability of a point in 
Figure 2a chosen at random being of one of these three types is 1/3. In 
this way we find for the topological information content 

$$I_{2a} = \log_2 3 = 1.58 \text{ bit} . \tag{1}$$

As we have already seen, the topological information content I of 
molecule 2b is zero: 

$$I_{2b} = 0 . \tag{2}$$


FIGURE 2 

In the molecule 2c, the result of this reaction, the points 1 and 6 are of 
degree one; the points 2 and 5 of degree three, each one being connected 
with one point of degree one and with two points of degree three; the 
points 3 and 4 of degree three, each one being connected with one point 
of degree two and with two points of degree three; and the points 7 and 8 
of degree two, each one being connected with one point of degree two and 
with one point of degree three. The probability of a point in Figure 2c 
chosen at random to be of one of these four types is 1/4. Therefore the 
topological information content of molecule 2c is 


$$I_{2c} = \log_2 4 = 2 \text{ bits} . \tag{3}$$

The change in the topological information content in the reaction of the 
kind illustrated in Figure 2 is therefore: 

$$\Delta I_2 = I_{2c} - (I_{2a} + I_{2b}) = 2 - (1.58 + 0) = 0.42 \text{ bit} . \tag{4}$$
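
The calculation in equations (1) to (4) can be sketched in a few lines. The helper below is an illustration only: it assumes that the points of a molecule have already been sorted by hand into classes of topologically equivalent points, as is done in the text, and it takes the number of points in each class as input. The name `information_content` is hypothetical.

```python
import math


def information_content(class_sizes):
    """Information content in bits of a partition of the points of a molecule.

    class_sizes holds the number of points in each class of topologically
    equivalent points; the probability of a class is its share of all points.
    """
    total = sum(class_sizes)
    return -sum((n / total) * math.log2(n / total) for n in class_sizes)


i_2a = information_content([2, 2, 2])     # classes {1, 6}, {2, 5}, {3, 4}
i_2b = 0.0                                # all points equivalent, as stated in the text
i_2c = information_content([2, 2, 2, 2])  # four classes of two points each
delta_2 = i_2c - (i_2a + i_2b)

print(f"I_2a = {i_2a:.2f}, I_2c = {i_2c:.2f}, change = {delta_2:.2f} bit")
```

Carried at full precision the change is slightly below the 0.42 bit of equation (4), because the text rounds the information content of molecule 2a to 1.58 before subtracting.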


As another example we will consider the reaction illustrated in Figure 3. 
As we have seen above, the molecule 3a, which is the same as the molecule 
2a, has a topological information content of 1.58 bit. In the molecule 3b 
the points 1 and 4 are of degree one and the points 2 and 3 of degree two. 
The probability of a point in Figure 3b chosen at random to belong to one 
of these two types is 1/2 and, therefore, the topological information content 
of molecule 3b is 

$$I_{3b} = \log_2 2 = 1 \text{ bit} . \tag{5}$$


In molecule 3c the points 1 and 6 are of degree one, each one being con- 
nected with a point of degree three; the points 2 and 5 of degree three, 
each one being connected with one point of degree one and with two points 
of degree three; the points 3 and 4 of degree three each one being con- 
nected with one point of degree two and two points of degree three; the 
points 7 and 10 of degree two, each one being connected with one point 
of degree two and with one point of degree three; and the points 8 and 9 
of degree two, each one being connected with two points of degree two. 
The probability of a point in Figure 3c chosen at random to be of one of 
these five kinds is 1/5. Thus we obtain for the topological information 
content of molecule 3c: 


$$I_{3c} = \log_2 5 = 2.32 \text{ bit} . \tag{6}$$

The change in the topological information content in a reaction of the kind 
represented in Figure 3 is therefore: 

$$\Delta I_3 = I_{3c} - (I_{3a} + I_{3b}) = 2.32 - (1.58 + 1) = -0.26 \text{ bit} . \tag{7}$$


Of course, as all atoms in the molecules so far considered are chemically 
equivalent and, as we will assume, have the same affinities for each other, 
the same molecules of Figures 3a and 3b may react also as in Figure 4, 
producing the molecule 4c, giving a different change in the topological 
information content. The topological information content of molecule 4a 
is again 1.58 bit and that of molecule 4b is 1 bit. But in molecule 4c all points 
are now topologically different and therefore the topological information 
content is 

$$I_{4c} = \log_2 10 = 3.32 \text{ bit} . \tag{8}$$
The change in the topological information content is therefore: 

$$\Delta I_4 = I_{4c} - (I_{4a} + I_{4b}) = 3.32 - (1.58 + 1) = 0.74 \text{ bit} . \tag{9}$$

The same molecules can react in still another way which is illustrated 
in Figure 5. Again the topological information content of molecule 5a is 
1.58 bit and that of molecule 5b is 1 bit. In molecule 5c the points 1, 6, 7, 
and 10 are of degree one, each connected with a point of degree three; the 
points 2, 5, 8, and 9 of degree three, each one being connected with one 
point of degree one and two points of degree three; and the points 3 and 4 
of degree three, each one being connected with three points of degree 
three. The probability of a randomly chosen point in Figure 5c being of 
the first type is 2/5, of the second type 2/5, and of the third type 1/5. 
Thus the topological information content of molecule 5c is: 

$$I_{5c} = -\left( \tfrac{2}{5} \log_2 \tfrac{2}{5} + \tfrac{2}{5} \log_2 \tfrac{2}{5} + \tfrac{1}{5} \log_2 \tfrac{1}{5} \right) = \tfrac{4}{5} \log_2 \tfrac{5}{2} + \tfrac{1}{5} \log_2 5 = 1.52 \text{ bit} . \tag{10}$$

The change in topological information content in the reaction illustrated 
in Figure 5 is therefore 

$$\Delta I_5 = I_{5c} - (I_{5a} + I_{5b}) = 1.52 - (1.58 + 1) = -1.06 \text{ bit} . \tag{11}$$
284 GEORGE KARREMAN 

It is interesting to note the different signs of the change in topological 
information content in the reactions illustrated in Figures 3, 4, and 5. 
The change in the topological information content may be positive or 
negative, depending on the different relative orientation of the same two 
molecules at the moment of reaction, because they form a reaction product 
of different configuration or different order. 
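
The three outcomes for the same pair of reagents can be compared side by side. This is a sketch under the same assumption as before, namely that the classes of equivalent points are those counted in the text; the helper name and the dictionary layout are hypothetical.

```python
import math


def information_content(class_sizes):
    """Bits of information for classes of topologically equivalent points."""
    total = sum(class_sizes)
    return -sum((n / total) * math.log2(n / total) for n in class_sizes)


# Reagents shared by Figures 3, 4, and 5.
i_a = information_content([2, 2, 2])   # molecule 3a = 4a = 5a
i_b = information_content([2, 2])      # molecule 3b = 4b = 5b

# Class sizes of the three possible products, as counted in the text.
products = {
    "Figure 3": [2, 2, 2, 2, 2],   # five classes of two points
    "Figure 4": [1] * 10,          # all ten points different
    "Figure 5": [4, 4, 2],         # probabilities 2/5, 2/5, 1/5
}

deltas = {name: information_content(sizes) - (i_a + i_b)
          for name, sizes in products.items()}

for name, delta in deltas.items():
    sign = "gain" if delta > 0 else "loss"
    print(f"{name}: change = {delta:+.2f} bit ({sign})")
```

Only the product of Figure 4, in which no two points are equivalent, carries more information than the two reagents together.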

As the next reaction we will consider the one illustrated in Figure 6. 
Here the molecule 6b, which is the same as molecule 3b, reacts with the 
molecule 6a, producing the molecule 6c. Above we have seen the 
topological information content of molecule 3b and, therefore, also that of molecule 
6b, is 1 bit. In molecule 6a the points 1, 3, 4, and 6 are of degree two, each 
one being connected with one point of degree two and one point of degree 
three; the points 2 and 5 are of degree three, each one being connected 


FIGURE 6 


with two points of degree two and one point of degree three. The 
probability of a randomly chosen point in Figure 6a to be of the first type is 2/3 
and the probability to be of the second type is 1/3. Hence we obtain for 
the topological information content of molecule 6a: 

$$I_{6a} = -\left( \tfrac{2}{3} \log_2 \tfrac{2}{3} + \tfrac{1}{3} \log_2 \tfrac{1}{3} \right) = 0.91 \text{ bit} . \tag{12}$$

In molecule 6c the points 1, 6, 8, and 9 are of degree two, each one being 
connected with one point of degree two and with one point of degree three; 
the points 2, 5, 7, and 10 are of degree three, each one being connected 
with one point of degree two and with two points of degree three; and the 
points 3 and 4 are of degree three, each one being connected with three 
points of degree three. The probability of a point in Figure 6c chosen at 
random to be of the first kind is 2/5, that to be of the second kind 2/5, 
and that to be of the third kind 1/5. Therefore the topological information 
content of molecule 6c is: 

$$I_{6c} = -\left( \tfrac{2}{5} \log_2 \tfrac{2}{5} + \tfrac{2}{5} \log_2 \tfrac{2}{5} + \tfrac{1}{5} \log_2 \tfrac{1}{5} \right) = 1.52 \text{ bit} . \tag{13}$$


As before the topological information content of molecule 6b is 1 bit. 
Therefore the change in the topological information content for the 
reaction of the kind illustrated in Figure 6 is: 

$$\Delta I_6 = I_{6c} - (I_{6a} + I_{6b}) = 1.52 - (0.91 + 1) = -0.39 \text{ bit} . \tag{14}$$

The difference between $\Delta I_3$ and $\Delta I_6$, 

$$\Delta I_3 - \Delta I_6 = 0.13 \text{ bit} , \tag{15}$$

might be considered as a quantitative measure of the difference in 
structural specificity of the two molecules 3a and 6a for the molecule 3b (or 
what is the same, molecule 6b). 
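
The proposed measure of structural specificity can be reproduced from the class counts alone. The sketch below is illustrative and assumes the partitions given in the text for molecules 3a, 3b, 3c, 6a, and 6c; the function names are hypothetical.

```python
import math


def information_content(class_sizes):
    """Bits of information for classes of topologically equivalent points."""
    total = sum(class_sizes)
    return -sum((n / total) * math.log2(n / total) for n in class_sizes)


def reaction_change(reagent_a, reagent_b, product):
    """Information content of the product minus that of the two reagents."""
    return information_content(product) - (
        information_content(reagent_a) + information_content(reagent_b)
    )


molecule_b = [2, 2]                                             # molecule 3b = 6b
delta_3 = reaction_change([2, 2, 2], molecule_b, [2, 2, 2, 2, 2])  # Figure 3
delta_6 = reaction_change([4, 2], molecule_b, [4, 4, 2])           # Figure 6

specificity_difference = delta_3 - delta_6
print(f"dI_3 = {delta_3:.2f}, dI_6 = {delta_6:.2f}, "
      f"difference = {specificity_difference:.2f} bit")
```

Because the same molecule b enters both reactions, its information content cancels in the difference, which therefore depends only on the partner molecule and on the product formed.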


The author would like to acknowledge his great indebtedness to Profes- 
sor Rashevsky for offering him the opportunity to read his paper before 
it was published and for several useful criticisms. 

The potential and current distribution in a nerve bundle is studied mathematically under 
various situations. Relations are derived expressing the effect of many fibers on the external 
potential, the value of the potential for a given nerve excitation pattern with and without the 
nerve sheath, the potential of a single fiber for a given outside potential pattern, and the effect 
of varying the frequency of alternating current stimulation. Results of the latter study are 
used to account for experimental deviations of the two-factor theory, and good agreement with 
the experimental results is found. 


The early work on peripheral nerve excitation was done on nerve 
bundles (cf. Erlanger and Gasser, 1937; Katz, 1939a). However, with the 
advent of research on single nerve fibers (Curtis and Cole, 1938; Hodgkin, 
1937; Hodgkin and Huxley, 1939; Kato, 1934; and Tasaki, 1953) many 
significant advances have been made due to the greater simplicity of ex- 
perimental material. However, certain problems can be solved only by 
studies of whole nerve fiber bundles. This would include problems of the 
effect of certain drugs on the entire bundle both in vitro and in vivo; 
measurements of the potentials of nerve fiber bundles in vivo, 
particularly when the stimulus is from a sense organ, such as with the optic or 
auditory nerve bundle; and many other situations. For this reason it is 
important to know the effects of the external nerve bundle sheath and of 
the presence of more than a single fiber on the external potential, which 
arises from the excitation of many of the fibers. Conversely, we should 
also investigate what the excitation pattern of an individual fiber is if 
the external measured potential is known. Also, it may be of interest to 
know the distribution of the current as a function of its frequency. This 
paper attempts to supply the answers to some of these questions. 

First, the effect of the presence of more than one fiber on the external 
potential is found. Next, the value of this potential for a given nerve 
excitation pattern is derived. After this, consideration can be given to the 
problem of what the potential on a single fiber is for a given outside 
potential. Then we shall discuss the external potential for a given nerve 
excitation pattern in experiments in which the sheath has been removed 
from the nerve bundle. Lastly, the effect of the frequency of alternating 
current stimulation on the current distribution is examined. 

The effect of a number of fibers on the external potential. Instead of con- 
sidering each fiber as a separate entity in any nerve bundle, it may be 
more convenient to group the fibers into various families (such as A, B, C, 
etc., fibers). Each family of fibers will then be characterized by the num- 
ber of fibers in the family as well as by certain parameters which are ap- 
proximately the same for all of the members of the family. A question 
which immediately arises is how any potential in the nerve bundle varies 
with the number of fibers within each of the families. To answer this 
question we first note that within any family of fibers some of the fibers 
may be firing (i.e., are excited), while others are not. Thus, we will 
simplify the mathematics by assuming that any family is subdivided into 
two subgroups, one in which all of the fibers are active and the other in 
which none of the fibers is. Each of the subgroups will be treated as a 
family in its own right, and the results will be simplified further by 
assuming that all of the fibers within a family are identical. 

The following assumptions, which are basically the same as those used 
by other workers in this field (cf. Hodgkin and Rushton, 1946; Davis in 
Lorente de Nó, 1947; and Taylor, 1952), will be used throughout this 
paper (see Fig. 1). 

1. The geometry of the system is represented by a group of infinitely 
long parallel cylinders with conducting cores and surface membranes, a 
common external conducting layer, a sheath surrounding this layer, and 
an outer conducting layer, all with linear electrical characteristics. 

2. The potential is constant throughout the cores, the external fluid, 
and the outer fluid at any point along the length of the model. 

3. The cores, the external fluid, and the outer fluid are pure constant 
linear resistances. 

4. In the resting state of the nerve bundle with no outside current in- 
put, all of the currents are zero. 

5. The equivalent circuit of the sheath is a resistance and capacitance 
in parallel. 

6. The excited nerve acts in a manner similar to the theoretical model 
of A. L. Hodgkin and A. F. Huxley (1952). Thus, the membranes may be 
considered to be a parallel network of the membrane capacitance and 
each of the membrane ion resistances in series with its ion potential (cf. 
Fig. 1 of Hodgkin and Huxley, loc. cit.). 

Let: 

x be the distance in centimeters along the length of the cylinders 

t be the time in seconds 

ν be the frequency in cycles/sec 

j be a subscript denoting a fiber of the jth family of fibers. Since all of the fibers 
within a family are assumed to be identical, the potentials, currents, 
resistances, etc., of the fiber will be the same for all of the fibers within a 
family. 

f be the total number of families of fibers 

l be the distance in centimeters between nodes 

$n_j$ be the number of fibers within the jth family of fibers 

$d_j$ be the diameter in centimeters of a fiber 

$W_j$ be the spike width in centimeters of a fiber 

$V_o$ be the potential in volts of the outer fluid* 

$V_e$ be the potential in volts of the external fluid 

$V_{ij}$ be the potential in volts of the inner core of a fiber 

$V_s = V_o - V_e$ be the potential in volts across the sheath 

$V_{mj} = V_e - V_{ij}$ be the potential in volts across a fiber membrane 

$V_{\mathrm{Na}j}$ be the potential in volts across a fiber membrane due to Na⁺ ion minus the 
resting state potential 

$V_{\mathrm{K}j}$ be the potential in volts across a fiber membrane due to K⁺ ion minus the 
resting state potential 

$V_{lj}$ be the potential in volts across a fiber membrane due to Cl⁻ and other ions 
minus the resting state potential 

$r_o$ be the resistance in ohm/cm per cm of the outer fluid 

$r_e$ be the resistance in ohm/cm per cm of the external fluid 

$r_{ij}$ be the resistance in ohm/cm per cm of the inner core of a fiber 

$R_e = \left( \frac{1}{r_e} + \sum_{j=1}^{f} \frac{n_j}{r_{ij}} \right)^{-1}$ be the effective total longitudinal resistance in ohm/cm 
per cm inside the sheath 

$r_s$ be the resistance in ohm cm of one cm length of sheath 

$\lambda_s = \sqrt{r_s/(r_o + R_e)}$ be the space constant in cm of the sheath 

$r_{mj}$ be the resistance in ohm cm of one cm length of membrane of a fiber 

$r_{\mathrm{Na}j}$ be the resistance in ohm cm of one cm length of a fiber due to Na⁺ ion 

$r_{\mathrm{K}j}$ be the resistance in ohm cm of one cm length of a fiber due to K⁺ ion 

$r_{lj}$ be the resistance in ohm cm of one cm length of a fiber due to Cl⁻ and other 
ions 

$R_m$ be the resistance in ohm of a node 

$Z_m$ be the complex impedance of a node 

$\lambda_m = \sqrt{R_m / [\, l \, (n r_e + r_i)]}$ be proportional to the space constant of a fiber 

$c_s$ be the capacity in farads/cm per unit length of sheath 

$c_{mj}$ be the capacity in farads/cm per unit length of membrane of a fiber 

$C_m$ be the capacity in farads of a node 

$\tau_s = r_s c_s$ be the time constant in sec of the sheath 

$\tau_m = R_m C_m$ be the time constant in sec of a node 

$i_p$ be the current in amperes/cm per cm length flowing radially into the outer 
fluid from an outside electrode 

$I_p$ be the total current in amperes through the electrodes 

$i_o$ be the longitudinal current in amperes flowing in the outer fluid 

$i_e$ be the longitudinal current in amperes flowing in the external fluid 

$i_{ij}$ be the longitudinal current in amperes flowing in the inner core of a fiber 

* All of the potentials (except those due to the Na⁺, K⁺, Cl⁻, and other ions) are the 
difference between the actual potential and the potential found for the resting state of the nerve 
bundle with no current input, i.e., they are the displacements from the resting state potential. 


FIGURE 1. Geometry of the model considered. Section of infinitely long cylinder. For 
nomenclature, see text. (Labeled regions: outer fluid; sheath; external (interstitial) fluid; 
membrane; core of a fiber of the jth family of fibers.) 


$i_s$ be the transverse current in amperes/cm per cm flowing inwardly across the 
sheath 

$i_{mj}$ be the transverse current in amperes/cm per cm flowing inwardly across the 
membrane of a fiber 

$I_m$ be the transverse current in amperes flowing inwardly across a node 


A simple application of Ohm’s and Kirchhoff’s laws to the model of 
Figure 1 yields: 

$$\frac{\partial i_o}{\partial x} = i_p - i_s , \tag{1}$$

$$\frac{\partial i_e}{\partial x} = i_s - \sum_{j=1}^{f} n_j i_{mj} , \tag{2}$$

$$\frac{\partial i_{ij}}{\partial x} = i_{mj} , \tag{3}$$

$$r_o i_o = -\frac{\partial V_o}{\partial x} , \tag{4}$$

$$r_e i_e = -\frac{\partial V_e}{\partial x} , \tag{5}$$

$$r_{ij} i_{ij} = -\frac{\partial V_{ij}}{\partial x} , \tag{6}$$

(8) 


Differentiating equations (4), (5), and (6) with respect to x, substituting 
into (1), (2), and (3), and then using (7) and (8) we have 


oh Vo 
Ox? 


di Va—.) . : 
7 ) rip, (9) 
rs 


Ox? = 
If we consider the unknowns to be $V_o$, $V_e$, and $V_{mj}$, we see that there are 
f + 2 unknowns and f + 2 equations to solve for them. Thus the above 
equations with their suitable boundary conditions completely determine 
the potentials of the system. 

We may rewrite equations (10) and (11) in a slightly different form, 
namely, 


f 
Hence, if the potentials are found as functions of the nerve parameters 
for the case of $n_j$ = 1 for all j (i.e., the families of fibers are treated as if 
they each possessed only one fiber), then the correct solution may be 
found by replacing $c_{mj}$, $r_{ij}$, $r_{\mathrm{Na}j}$, $r_{\mathrm{K}j}$, $r_{lj}$ (and, therefore, $r_{mj}$) by $n_j c_{mj}$, 
$r_{ij}/n_j$, $r_{\mathrm{Na}j}/n_j$, $r_{\mathrm{K}j}/n_j$, $r_{lj}/n_j$ (and $r_{mj}/n_j$). This conclusion is physically 
obvious from assumptions 2 and 3, but it appeared worth while to have 
this fact set down some place in the literature. The conclusion above is 
implicitly used in the derivations of Davis in Lorente de Nó (loc. cit.). 

It should be emphasized that no assumption need be made as to the 
values of $c_{mj}$, $r_{\mathrm{Na}j}$, $r_{\mathrm{K}j}$, $r_{lj}$, and $r_{mj}$ as functions of x or t. Thus the above 
conclusion is valid for myelinated and non-myelinated nerves, and 
regardless of how the membrane parameters change during excitation. Also, 
if we assume a nerve membrane model in which $c_{mj}$ and $r_{mj}$ are connected 
in parallel, instead of using the Hodgkin and Huxley model, the same 
result as shown above would have been found. 
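
The scaling rule can be illustrated for the simpler membrane model just mentioned, a resistance and a capacitance in parallel. The sketch below compares the impedance of n identical membranes connected in parallel with that of a single membrane whose resistance is divided by n and whose capacitance is multiplied by n. All numerical values and names are hypothetical and chosen only for illustration.

```python
import math


def membrane_impedance(r_m, c_m, freq):
    """Complex impedance of a resistance r_m in parallel with a capacitance c_m."""
    omega = 2.0 * math.pi * freq
    return 1.0 / (1.0 / r_m + 1j * omega * c_m)


def in_parallel(impedances):
    """Impedance of several branches connected in parallel."""
    return 1.0 / sum(1.0 / z for z in impedances)


# Hypothetical values, not taken from the paper.
n_fibers = 7
r_m = 2.0e4      # ohm cm
c_m = 1.5e-9     # farads/cm
freq = 500.0     # cycles/sec

z_family = in_parallel([membrane_impedance(r_m, c_m, freq)] * n_fibers)
z_single = membrane_impedance(r_m / n_fibers, n_fibers * c_m, freq)

print(abs(z_family - z_single) / abs(z_single))  # relative difference
```

The two impedances agree to within rounding error at any frequency, which is the statement that a family of identical fibers may be replaced by one fiber with scaled parameters.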

The potential outside of the sheath. In a nerve bundle, the potential which 
is actually measured during activity is not that of the single nerve fiber 
but is instead a sum of potentials from a number of fibers. In many of the 
present day experiments, the epineural sheath of the nerve trunk is re- 
moved so that a more accurate representation of the potentials can be 
measured. However, in certain cases (in vivo experiments; extreme 
difficulty of removing the sheath) it is necessary to make the measurements 
with the sheath remaining. For many cases it is very useful to know how 
much distortion is introduced by the sheath and, on the other hand, when 
the effect of the sheath may be ignored. 

We shall now attempt to treat this problem by using the assumptions 
and equations of the preceding section. Because of the mathematical 
difficulties involved, we shall first assume that if a fiber is not excited it may 
be considered to have no effect whatsoever on the potential distribution. 
(This will be taken up again at a later point.) Thus, the symbol f will 
represent the number of families of fibers with at least one active fiber 
and $n_j$ will represent the number of active fibers within the jth family of 
fibers. We shall now find $V_o$ in terms of $V_{mj}$ and of the parameters of the 
bundle by solving equations (1) through (7) with, of course, $i_p$ = 0. Since 
we will not need (8), we see that the solution will be independent of the 
properties of the membrane. 

If we differentiate (4), (5), and (6), substitute them into (1), (2), and 
(3), add them up in the obvious manner so that the currents are eliminated 
from the equations, and then integrate twice, we find, remembering that 
the potentials are zero at x = ±∞, 

$$\frac{V_o}{r_o} + \frac{V_e}{R_e} = \sum_{j=1}^{f} \frac{n_j V_{mj}}{r_{ij}} . \tag{14}$$

The other equation which we shall use is (9) with $i_p$ = 0, which was 
derived from equations (1) and (7). 

In order to solve these equations, we take the Fourier transform over x 
and then the Laplace transform over t of (9) and (14) (cf. Sneddon, 1951; 
Churchill, 1944, for discussion of the Fourier and Laplace transform; and 
Taylor, loc. cit., for an example of their application to a similar problem). 
Eliminate the transform of $V_e$ between the two equations. Then solve 
for the transform of $V_o$ in terms of the Fourier transform variable, the 
Laplace transform variable, the transform of $V_{mj}$, and the parameters of 
the system. Then taking the inverse transforms by use of formulas 444, 
202, and 820 of G. A. Campbell and R. M. Foster (1942) we find 


hae 


ods a Sofi |e RS i 
Al gi Sore! 5 z R, : p whe U 
x Vg (2 0) | e [ al ) Mag 42 OE ff i (15) 
Vo(x, t) = 


go lltlel+lr,(e—2) *\/larrg]} 


7/2 Vng(#) t— r)drdzt. 


We thus see that $V_o$ is made up of three terms. The first term involves 
the initial value of $V_o$, the second involves the initial values of the $V_{mj}$'s, while the 
third term involves the further values of the $V_{mj}$'s. The contribution to $V_o$ 
through the $V_{mj}$'s from the second and third term is simply the sum of the 
contributions from each separate fiber, the whole being modified by the 
value of $R_e$, which is a function of the number of active fibers. Since the 
external resistance, $r_e$, is usually much less than the total of the internal 
resistances in parallel, $R_e$ is approximately given by $r_e$ alone and thus the 
value of $R_e$ is fairly independent of the number of excited fibers. It should 
be emphasized, however, that the contribution to $V_o$ for each $V_{mj}$ is not 
simply a replica of the $V_{mj}$ reduced in amplitude by an amount depending 
on the parameters of the system. A distortion may be introduced, which 
is different for each type of fiber (see Fig. 2 and following discussion). 
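
The weak dependence of the effective longitudinal resistance on the number of excited fibers is easy to see numerically. The sketch below assumes that the effective resistance is the parallel combination of the external fluid with all active fiber cores, as in the nomenclature; the resistance values are hypothetical and serve only to show the trend when the external resistance is much smaller than the core resistances in parallel.

```python
def effective_resistance(r_e, families):
    """Parallel combination of the external fluid and all active fiber cores.

    families is a list of (n_j, r_ij) pairs: number of active fibers in a
    family and longitudinal core resistance of one such fiber.
    """
    return 1.0 / (1.0 / r_e + sum(n_j / r_ij for n_j, r_ij in families))


# Hypothetical values, not taken from the paper.
r_e = 1.0e4    # external fluid, ohm per cm
r_ij = 1.0e8   # core of one fiber, ohm per cm

for n_active in (1, 10, 50):
    big_r_e = effective_resistance(r_e, [(n_active, r_ij)])
    print(f"{n_active:3d} active fibers: R_e / r_e = {big_r_e / r_e:.4f}")
```

With these illustrative values the ratio stays within about one per cent of unity, so the effective resistance may be replaced by the external resistance alone.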

In order to be able to analyze equation (15) more fully, it will be 
necessary to consider it for a specific case. To do this we shall use values of $r_s$ 
and $c_s$ found by R. E. Taylor (1950) on frog sciatic nerve, namely, $r_s \approx 10^3$ 
ohm cm and $c_s \approx .02$ µf./cm for a sheath of 700 µ diameter. Thus, the 
time constant $\tau_s \approx .02$ msec. It should be emphasized that the product 
$r_s c_s$ is independent of the dimension of the nerve bundle; hence if the 
sheath material is the same for different nerves, $\tau_s$ will remain the same 

FIGURE 2. The outside potential for different values of $\lambda_s/a$ graphed according to equations 
(16) and (17). The solid line represents $V_m/.54Aa^2$ and also $V_o/1.08ABa^2\lambda_s$ for $\lambda_s/a \ll 1$; 
the dotted line represents $V_o/.48ABa^2$ for $\lambda_s/a = \tfrac{1}{2}$; the long-dashed line represents 
$V_o/.81ABa^2$ for $\lambda_s/a = 1$; the short-dashed line represents $V_o/1.16ABa^2$ for $\lambda_s/a = 2$; and the 
dash-dot line represents $V_o/2ABa^2$ for $\lambda_s/a \gg 1$. 


also. Since we are usually considering times of the order of 1 msec from 
the time of excitation, therefore, e~/’* is of the order of 10-”°, and the 
first portion of the right-hand side of (15) may be ignored. That is, due 
to the very short time constant of the sheath, the effect of the initial 
conditions is negligible when the action potential reaches the point where 
the measuring electrodes are. For the second portion of the right-hand 
side of (15), a further discussion is necessary. 
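The arithmetic behind this estimate can be checked directly. The following is a minimal sketch, assuming the sheath values quoted above ($r_s$ of the order of $10^3$ ohm cm, $c_s$ of .02 μf/cm) and an observation time of 1 msec; the variable names are ours and are not part of the original treatment.

```python
import math

# Sheath values quoted in the text (Taylor, 1950), taken as orders of magnitude.
r_s = 1.0e3       # sheath resistance, ohm cm
c_s = 0.02e-6     # sheath capacitance, farad per cm (.02 microfarad per cm)

tau_s = r_s * c_s                     # sheath time constant, in seconds
t_obs = 1.0e-3                        # assumed observation time: 1 msec after excitation
decay = math.exp(-t_obs / tau_s)      # weight carried by the initial-condition term

print(f"tau_s = {tau_s * 1e3:.3f} msec")
print(f"exp(-t/tau_s) = {decay:.2e}")
```

The exponent is $-t/\tau_s = -50$, so the initial-condition term is suppressed by many orders of magnitude by the time the spike reaches the electrodes.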

By setting the derivative of the function [excluding $V_{mj}(z, t - \tau)$]
under the double integral on the right-hand side of (15) equal to zero, it can
be easily shown that the time at which this function is a maximum is
approximately equal to $\tfrac{1}{2}(\tau_s/\lambda_s)|x - z|$. Also, for
twice this time or greater, this function is completely negligible compared to
its value at the maximum time. As shown in the next paragraph, the maximum
effective value for $|x - z|$ is of the order of $5\lambda_s$. Thus, the
maximum time is at most $(5/2)\tau_s$, which for the above values is of the
order of .05 msec. The shortest spike duration is of the order of .4 msec
(Grundfest, 1952), and the maximum change in the potential occurs in the order
of 1/3 of this time, which is about .1 msec. Hence, to a good approximation
(the longer the duration of the spike, the better the approximation), we may
consider that in the integration over time in (15) the function under the
integral is appreciable only for a time so short that the
$V_{mj}(z, t - \tau)$ can be considered as approximately constant and can thus
be taken outside of the integral (with $\tau$ set equal to zero). That is, due
to the very short time constant of the sheath, we can ignore the time
dependency due to the capacitance of the sheath. By use of formula 822 of
Campbell and Foster (loc. cit.) and using the approximations discussed above,
we see that equation (15) reduces to


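$$V_0(x, t) = \frac{r_0 R_e \lambda_s}{2 r_s} \sum_{j=1}^{f} \frac{n_j}{r_{ij}} \int_{-\infty}^{+\infty} e^{-|x - z|/\lambda_s}\, V_{mj}(z, t)\, dz. \qquad (16)$$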


In the above equation, the "important" contribution to the integral occurs
within the range where the exponent is larger than approximately $(-5)$. Thus,
we see that the maximum effective value for $|x - z|$ is of the order of
$5\lambda_s$, which was the value used in the argument of the preceding
paragraph.

In order to observe the effect of the integral in (16) more readily, a
particular example will be used. Choose a bundle composed of identical fibers
($f = 1$). Since $V_m(x, t)$ is a traveling wave, we need only focus our
attention at a particular arbitrary $t$ and express $V_m(x, t)$ as

$$V_m(x) = \begin{cases} A x^2 e^{-x/a} & (x \ge 0), \\ 0 & (x \le 0). \end{cases} \qquad (17)$$

We could have used the equations for $V_m$ as put forward by A. Rosenblueth,
et al. (1948), but since we are only interested in qualitative results, (17)
will suffice for our purposes. If we let
$B = [(\tfrac{1}{2} r_0 R_e \lambda_s)/r_s]\, n/r_i$, and $\xi = x/a$, we may
graph $V_0$ and $V_m$ vs. $\xi$ for different ratios of $\lambda_s/a$ as in
Figure 2.
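The normalizing factors quoted in the caption of Figure 2 can be checked numerically. The sketch below assumes that the integral in (16) has the kernel $e^{-|x - z|/\lambda_s}$ and that the spike has the form (17); it sets $A = B = a = 1$, so that the returned number is the peak of $V_0/ABa^3$. The function name and the grids are ours.

```python
import numpy as np

def outside_potential_peak(lam_over_a):
    """Peak over xi of the integral in (16) for the spike (17), with A = B = a = 1."""
    z = np.linspace(0.0, 40.0, 8001)     # integration variable; V_m vanishes for z < 0
    xi = np.linspace(-3.0, 10.0, 261)    # positions at which V_0 is evaluated
    v_m = z**2 * np.exp(-z)              # equation (17) with A = a = 1
    kernel = np.exp(-np.abs(xi[:, None] - z[None, :]) / lam_over_a)
    f = kernel * v_m[None, :]
    h = z[1] - z[0]
    v_0 = h * (f.sum(axis=1) - 0.5 * (f[:, 0] + f[:, -1]))   # trapezoidal rule in z
    return v_0.max()

peaks = {ratio: outside_potential_peak(ratio) for ratio in (0.5, 1.0, 2.0)}
for ratio, peak in peaks.items():
    print(f"lambda_s/a = {ratio}: peak of V_0/(A B a^3) = {peak:.3f}")
```

For $\lambda_s/a = 1$ the peak comes out near .81, and the values for $\lambda_s/a = 1/2$ and $2$ fall within a few per cent of the .48 and 1.16 used in the caption, which is about what one would expect of factors read from hand computation.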
It can easily be seen from Figure 2 that as $\lambda_s$ increases in value the
curve of $V_0$ for a given $V_m$ tends to broaden out in space, while as
$\lambda_s$ decreases in value the curve of $V_0$ tends to become more similar
in shape to $V_m$. The amplitude of $V_0$ depends upon the parameters of the
system, increasing with an increase of $r_0$ or $R_e$ and decreasing with an
increase of $r_s$ or $r_i$. (This is in agreement with the well known usage of
blotting the nerve bundle so as to increase $r_0$, a result which not only
increases the amplitude of $V_0$ but, due to the decrease of $\lambda_s$,
increases its reliability as a measure of $V_m$.) To find the value of
$\lambda_s$ for which the shapes of $V_0$ and $V_m$ are similar we note that in
Figure 2 the effective width of the spike is approximately equal to $\xi = 6$.
If this is of the order of $W$ cm experimentally, we see that
$a \approx W/6$. As $V_0$ is similar in shape to $V_m$ for
$\lambda_s < a/2$, therefore, for $\lambda_s$ less than or of the order of
$W/12$, $V_0$ will be similar in shape to $V_m$. Mathematically, this means
that for $\lambda_s < W/12$, $V_m$ may be taken outside of the integral in
(16). Thus, if $r_s/(r_0 + R_e) < W^2/150$,

$$V_0 \approx \frac{n}{r_i} \left( \frac{1}{\dfrac{1}{r_0} + \dfrac{1}{R_e}} \right) V_m. \qquad (18)$$


For mammalian nerve, W varies between 0.1 cm and 8 cm (calculated 
from values quoted by Grundfest, loc. cit.). 

As stated above, $V_0$ is simply the sum of the contributions from each
separate fiber. We may, therefore, discuss the effect of a given fiber
diameter on the portion of $V_0$ due to the contribution of that given fiber.
Since $r_s$, $r_0$, and $R_e$ are the same for all of the fibers, the total
effect of the fiber diameter is then due to $r_{ij}$ and $V_{mj}$ ($r_{ij}$ is
inversely proportional to the cross-sectional area and is, therefore,
proportional to $1/d_j^2$). As shown by H. S. Gasser and H. Grundfest (1939),
the spike width for a given type of fiber is proportional to the velocity,
which in turn is proportional to the diameter of the fiber. Thus the spike
width of $V_{mj}$ is proportional to $d_j$. As $V_{mj}$ appears under the sign
of the integral in (16), therefore, depending on the value of $\lambda_s$, the
integral will vary as $d_j^{\alpha}$, where $0 < \alpha < 1$, if we assume
that the maximum value of $V_{mj}$ is the same for all fibers. This may also
be seen from Figure 2, where $a$ is now proportional to $d_j$ and $Aa^2$ is a
constant and equal to 1/.54 times the maximum amplitude of $V_{mj}$. Thus,
depending on the other parameters, $V_0$ varies as the $b$th power of the
fiber diameter $d_j$, where $2 < b < 3$. Therefore, like Rushton (1951), we do
not understand why it is found experimentally that the amplitude of $V_0$ is
proportional to the first power of $d_j$ (cf. Gasser and Grundfest, loc.
cit.).
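The statement that the integral varies as a power of the diameter between zero and one can be illustrated numerically. The following sketch assumes the spike shape (17) scaled to unit maximum amplitude, with width parameter $a$ standing in for the fiber diameter, and the kernel $e^{-|x - z|/\lambda_s}$ with $\lambda_s = 1$; it estimates the local exponent from successive doublings of $a$. All names and the chosen widths are ours.

```python
import numpy as np

def integral_peak(a, lam=1.0):
    """Peak over x of the integral in (16) for a unit-amplitude spike of width parameter a."""
    z = np.linspace(0.0, 40.0 * a, 8001)
    x = np.linspace(-3.0 * a, 10.0 * a, 261)
    # Spike (17) normalized so that its maximum, reached at z = 2a, equals 1.
    v_m = (z / a) ** 2 * np.exp(-z / a) / (4.0 * np.exp(-2.0))
    f = np.exp(-np.abs(x[:, None] - z[None, :]) / lam) * v_m[None, :]
    h = z[1] - z[0]
    return (h * (f.sum(axis=1) - 0.5 * (f[:, 0] + f[:, -1]))).max()

widths = np.array([0.25, 0.5, 1.0, 2.0, 4.0])        # a / lambda_s, proportional to d_j
peaks = np.array([integral_peak(a) for a in widths])
alpha = np.diff(np.log(peaks)) / np.diff(np.log(widths))   # local log-log slopes

for a_lo, a_hi, slope in zip(widths[:-1], widths[1:], alpha):
    print(f"a from {a_lo} to {a_hi}: exponent about {slope:.2f}")
```

For narrow spikes the integral follows the area of the spike and the exponent approaches one; for wide spikes the integral saturates at $2\lambda_s$ times the amplitude and the exponent approaches zero. Multiplying by the factor $1/r_{ij}$, proportional to $d_j^2$, gives the range $2 < b < 3$.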
Let us now consider what the total area under the outside potential curve is,
i.e., $\int_{-\infty}^{+\infty} V_0(x, t)\, dx$. In order to do this, it will
be most convenient first to take the Laplace transform over $t$ of equation
(15) by use of formulas 202, 444, and 820 of Campbell and Foster (loc. cit.),
integrate over $x$, and then take the inverse transform by use of formula 438
of the above reference. Doing this, we find:


f- Vole, as = ane Lae QO) ks ye 


: : fobouso nN; 
(19)


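If we consider only times much greater than $\tau_s$, the above equation
reduces to

$$\int_{-\infty}^{+\infty} V_0(x, t)\, dx = \frac{r_0 R_e}{r_0 + R_e} \sum_{j=1}^{f} \frac{n_j}{r_{ij}} \int_{-\infty}^{+\infty} V_{mj}(z, t)\, dz. \qquad (20)$$

Note that the above expression is independent of the parameters of the
sheath.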
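That the area does not depend on the sheath can be checked by integrating (16) over $x$ for two different sheath resistances. The sketch below makes several assumptions that should be stated plainly: it takes $\lambda_s = \sqrt{r_s/(r_0 + R_e)}$ and $R_e = 1/(1/r_e + n/r_i)$, uses a single fiber type with the spike shape (17) and $A = a = 1$, and uses purely illustrative resistance values that are not taken from any experiment.

```python
import numpy as np

def area_under_v0(r_s, r_0=2.0e3, r_e=1.0e3, r_i=1.0e7, n=1000):
    """Return (numerical area of V_0 from (16), area predicted by (20))."""
    big_r_e = 1.0 / (1.0 / r_e + n / r_i)         # assumed definition of R_e
    lam_s = np.sqrt(r_s / (r_0 + big_r_e))        # assumed sheath space constant
    z = np.linspace(0.0, 40.0, 2001)
    x = np.linspace(-30.0, 70.0, 1001)
    v_m = z**2 * np.exp(-z)                       # spike (17) with A = a = 1
    f = np.exp(-np.abs(x[:, None] - z[None, :]) / lam_s) * v_m[None, :]
    hz, hx = z[1] - z[0], x[1] - x[0]
    inner = hz * (f.sum(axis=1) - 0.5 * (f[:, 0] + f[:, -1]))
    v_0 = (r_0 * big_r_e * lam_s / (2.0 * r_s)) * (n / r_i) * inner
    area = hx * (v_0.sum() - 0.5 * (v_0[0] + v_0[-1]))
    # The integral of z**2 * exp(-z) over z >= 0 equals 2.
    predicted = (r_0 * big_r_e / (r_0 + big_r_e)) * (n / r_i) * 2.0
    return area, predicted

results = {r_s: area_under_v0(r_s) for r_s in (1.0e3, 4.0e3)}
for r_s, (area, predicted) in results.items():
    print(f"r_s = {r_s:.0f}: area = {area:.5f}, equation (20) gives {predicted:.5f}")
```

Changing $r_s$ changes the shape of $V_0$ through $\lambda_s$, but the computed area stays at the value given by (20).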

It is customary to assume that except for the initial transients the
$V_{mj}$'s are traveling waves (cf. Hodgkin, 1954), which implicitly assumes
that the membrane voltage is independent of the condition of the other fibers
about it. Thus, we see that $\int_{-\infty}^{+\infty} V_0(x, t)\, dx$ is a
constant, independent of time. That is, as the spike potentials from the
individual fibers spread out in space due to the different conduction
velocities of the different fibers, the total area underneath the outside
potential curve will remain constant. This is in agreement with experimental
observations (Erlanger and Gasser, loc. cit., p. 14).

We shall now consider the effect of those fibers which are not excited upon
the measured value of $V_0$. The difficulty involved in solving this problem
is purely a mathematical one. Since for a non-firing fiber we do not a priori
know the value of its $V_{mj}$ (as we do for the excited fiber), we must,
therefore, attempt to solve for it, which is difficult. However, if we assume
that the external resistance is much less than the effective internal
resistance of the non-firing fibers, i.e.,
$r_e \ll 1/\sum_j (n_j/r_{ij})$, where $n_j$ and $r_{ij}$ refer to the
non-firing fibers, then it is physically obvious that the assumption made
earlier in this section of ignoring the effect of the non-firing fibers is
valid.

The potential across a fiber for a given outside potential. In the previous
section the problem of what $V_0$ would be, given the form of the $V_{mj}$'s,
was discussed. In many cases, however, the opposite problem is of importance,
namely, for a given $V_0$, what is the $V_m$ which produced it? It is obvious
that if there is more than one type of fiber present in the bundle this
question cannot be answered on the basis of knowledge of $V_0$ and the
parameters of the bundle alone. For this reason we shall assume that there is
only one type of fiber which is excited and, for the same mathematical reasons
as discussed in the previous section, it will be assumed that the effect of
the non-firing fibers is negligible.

For this case, equations (9), with 7, = 0, and (14) will be used. Solving 
equation (9) for V. and then substituting it into (14), we have, after ele- 
mentary manipulations, 


= ’ aE Lae F my 1 1 7 —i/r, 
J eee) =") [Fv (x, 0) (—+3)Fo(x, 0) Je 


rs *0?Vo(%, 7) _-(-+)/, S Se oe 
f ° Fee a+z)Volxs Of. 


R.roTs Ox? 


(21) 


Applying (21) to a specific example, let us consider the same values for
$\tau_s$ as in the previous section. By similar arguments we see that to a
good approximation the term involving $e^{-t/\tau_s}$ may be ignored and that
$\partial^2 V_0/\partial x^2$ may be taken outside the sign of the integral.
Thus, equation (21) reduces to

$$V_m(x, t) \approx \frac{r_i}{n} \left[ \left( \frac{1}{r_0} + \frac{1}{R_e} \right) V_0(x, t) - \frac{r_s}{r_0 R_e}\, \frac{\partial^2 V_0(x, t)}{\partial x^2} \right]. \qquad (22)$$

It is, of course, expected that $c_s$, the integral over time, and the initial
conditions should drop out of the equation, since under the same assumptions
they did so for (16). No general qualitative statements can be made about
equation (22), due not only to lack of knowledge of the parameters, which may
vary tremendously between different experiments, but also because
$\partial^2 V_0/\partial x^2$ can be much larger or much smaller than $V_0$
depending on whether the spike width is small or large.

The external potential of a nerve bundle without a sheath. In previous
sections the problem of the potential at various points was discussed for the
case of a nerve bundle with intact sheath. However, as mentioned before, there
are many experiments in which the sheath is removed and the action potential
of either a single fiber or a group of fibers is measured. We shall, as
before, assume that the effect of the non-excited fibers is negligible. Noting
that for this case of no sheath, $V_0$ is equal to $V_e$, and that the
effective external resistance is now $1/[(1/r_e) + (1/r_0)]$, which we may for
simplicity call $r_e$, then from (14) we see that

$$V_e(x, t) = R_e \sum_{j=1}^{f} \frac{n_j}{r_{ij}}\, V_{mj}(x, t). \qquad (23)$$

It should be mentioned that (23) for the case of one fiber has been used in
the literature for many years and thus the equation above is only a slight
extension of a well known relation.

The following features of equation (23) should be emphasized. First, the only
assumptions made are those concerning the non-excited fibers and the constancy
of $r_{ij}$ and $r_e$. Thus the equation holds regardless of the type of
fibers involved or their mode of excitation. A second point is that (23) may
be derived from (16) for a small enough $r_s$, (16) in turn being derived from
(15) for a sufficiently small time constant, $r_s c_s$. The value of $r_s$
which is needed so that the effect of the sheath may be ignored was found in
that section by letting $\sqrt{r_s/(r_0 + R_e)}$ be much less than the width
of the spike. Thus the value of $r_s$ which is needed is not a certain fixed
quantity but depends on all of the other chemical and physical parameters of
the nerve bundle. This point is mentioned here in the hope that it may be of
some help in discussing the importance of the nerve sheath in electrical
measurements in experiments done on excised nerve (cf. Lorente de Nó, 1950,
1952; Shanes, 1954; Rashbass and Rushton, 1939; and Taylor, 1950). A third
feature of (23) is that if there is only one type of fiber in the nerve
bundle, then it reduces to

$$V_e(x, t) = \frac{n r_e}{r_i + n r_e}\, V_m(x, t), \qquad (24)$$
which is the same as equation (18) with $r_e$ as the effective external
resistance. Note that $V_e(x, t)$ and $V_m(x, t)$ are proportional to each
other, the coefficient of proportionality being independent of position and
time. A graph of (24) is shown in Figure 3. From the graph we see that for
small $n$, $V_e$ increases linearly with $n$, a result which was also true for
the case of $V_0$ [see (15) and the discussion following it]. For large $n$,
i.e., large compared to $r_i/r_e$, $V_e$ approaches a plateau level equal to
$V_m$. For the case of $V_0$, since as $n$ increases $R_e$ approaches zero
and, therefore, $(r_0 + R_e)$ approaches $r_0$, we see that in equation (15)
$V_0$ also approaches a plateau
as m becomes large. (This plateau level, however, is not equal to Vm but 
is determined by the #, cs, 7s, 70, and the shape of Vn ;.) Thus we see that 
the approximation obtained by simply adding up the potentials due to each
nerve depends on the value of $r_i/r_e$; the greater this value, the more
valid is the simple summation. This approximation will probably suffice for
most cases (cf. Erlanger and Gasser, loc. cit., and Gasser and Grundfest, loc.
cit.) but for some nerves the non-linearity factor may be of importance. For
example, in the experiments of S. Bryant (personal communication, 1954) on
Carcinus nerve the evoked potential $V_0$ did not change (i.e., a plateau had
been reached) as the excitation potential was increased. However, by means of
light scattering observations it was seen that more fibers were excited by the
increased excitation potential. This might possibly be explained by the
results above.

FIGURE 3. The ratio $V_e(x, t)/V_m(x, t)$ vs. the number of active fibers
times $r_e/r_i$, graphed according to equation (24) for the case of only one
type of fiber in a nerve bundle with no outside sheath.
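The linear rise and the plateau described for the sheathless bundle can be seen in a few lines. The sketch below assumes that (24) has the form $V_e/V_m = n r_e/(r_i + n r_e)$, which is the form consistent with the limits stated in the text; the ratio $r_e/r_i = 10^{-3}$ is an illustrative value only.

```python
def external_to_membrane_ratio(n, re_over_ri):
    """V_e / V_m for n identical active fibers, written in terms of n * r_e / r_i."""
    u = n * re_over_ri
    return u / (1.0 + u)

RE_OVER_RI = 1.0e-3   # hypothetical ratio of external to internal resistance

for n in (1, 10, 100, 1000, 10000, 100000):
    ratio = external_to_membrane_ratio(n, RE_OVER_RI)
    print(f"n = {n:6d}: V_e/V_m = {ratio:.4f}")
```

For $n$ small compared to $r_i/r_e$ the ratio grows in proportion to $n$, so that simple summation of fiber potentials holds; for $n$ large compared to $r_i/r_e$ further recruitment of fibers scarcely changes the external potential.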

Effect of the frequency of alternating current stimulation on current
distribution. If a peripheral nerve is stimulated by alternating current, it
is found (Katz, 1939b) that the threshold current for excitation deviated from
that predicted by the two-factor theory of N. Rashevsky (1948) and A. V. Hill
(1936). This deviation increased as the frequency increased, reaching, for the
stimulating electrodes far apart, an asymptote of about half of the
theoretical value. A possible explanation of this deviation (i.e., due to the
rectifying properties of the membrane) was worked out quantitatively by the
present author and H. D. Landahl (1949) but the results were not in good
agreement with those found experimentally.

Although the two-factor theory of excitation has in a sense been superseded
by the theory of Hodgkin and Huxley (loc. cit.), it is still of interest
to study the older theory and to attempt to account for any major devia- 
tions of this theory from experimental results. This is because the Hodg- 
kin-Huxley theory is mathematically so complex that as yet it has not 
been shown that it will yield the correct results for those experiments 
which the two-factor theory will account for. Also, if the experimental de- 
viation of the results from those predicted by the older theory can be ac- 
counted for, then the new theory and any other which may subsequently 
be proposed will merely have to be shown to be approximately equivalent 
to the two-factor theory in order to account for a large amount of experi- 
mental data. Since the two-factor theory has been so successful for many 
different experimental situations, it would seem probable that the theory 
is good. Therefore, if any discrepancy between the theory and experiment 
can be explained by a plausible additional hypothesis, then such an addi- 
tional hypothesis may be applied to any newer theory. For example, if the 
rectifying properties of the membrane had proven successful in explaining 
the above discrepancies, then either the new theory would incorporate 
these properties directly, or the additional assumption of those rectifying 
properties will have to be made in the newer theory. In the explanation 
which is to be offered in the following paragraphs, the latter will probably 
apply. 

When a current is passed across a nerve, excitation is caused by the 
current which crosses the nerve membrane at a node (Rushton, 1927; 
Stämpfli, 1954). It has been tacitly assumed that the amount of current
which crosses a node is a certain fraction of the total input current through 
the electrodes, this fraction being independent of any changes in the time 
course of the current. However, since the nerve bundle sheath and the 
nerve membrane have capacitive elements within them, therefore, in the 
case of alternating current, the fraction of the current which crosses the 
node will not be a constant but will depend on the frequency. Thus, if the
present theory is correct, the ratio of the threshold current predicted by 
the two-factor theory to that measured experimentally should be equal to 
the ratio of the current passing across the node at a given frequency to 
that passing across it at zero frequency. (That this factor might be of 
some importance has been suggested by Hill et al., 1936, p. 120.)

We may assume that the sheath and membrane may each be represented by a
capacitance and resistance in parallel, with zero inductance; that both the
resistance and capacitance remain constant; and that the capacitance has a
constant phase angle of 90° as the frequency changes (cf. Katz, 1939a, for a
review of the literature on this subject). That the latter two assumptions are
not completely correct has been known for some time, but they are introduced
so that quantitative work may be carried out.
Thus, the following will happen qualitatively as the frequency increases: 

Since the impedance of the sheath and of the nerve membrane decreases 
with increasing frequency, more of the current will pass through the sheath 
directly underneath the electrode, i.e., the effective space constant of the 
sheath will decrease with increasing frequency. Also, due to the lowering 
of the impedance of the membrane, more of the current will pass through 
the node nearest the stimulating electrode as the frequency increases, 
instead of by-passing this node to enter the nerve fiber via the membrane 
“upstream.’’ That is, the effective space constant of the membrane will 
also decrease. As the frequency further increases, the fraction of the cur- 
rent passing through the node reaches an asymptotic value due to the in- 
ternal resistance of the nerve fiber. This value for the case of one electrode 
directly above the node and the other electrode at least one node distant 
from it is easily seen to be $r_e/(nr_e + r_i)$. If we further assume that the
time constant of the myelin is less than the time constant of the node, 
then at first the current through the node will increase but as the fre- 
quency increases further the myelin impedance starts to decrease appre- 
ciably and some of the current may by-pass the node and flow through 
the myelin around it. Thus, due to the action of both the sheath and the 
node impedance, there will at first be an increase of the current through 
the node, but as this increase approaches its asymptotic value for the case 
of zero frequency myelin impedance there may actually be a decrease due 
to the decrease of the myelin impedance. Hence the current through the 
node first increases to a maximum, then decreases to its asymptotic value. 
If the electrodes are placed closer together, then, as will be shown for the 
case of the two electrodes being directly above nodes [eq. (27)], the frac- 
tion of the current through the node at a frequency equal to zero is less 
than if the electrodes are farther apart. This effect is most pronounced for 
distances of the order of an internodal length. Since for high frequencies 
the asymptotic value of the fraction of the current through the node will 
be the same regardless of the distance between the electrodes, providing 
that they are more than one internode apart, we would expect the fraction 
of the current through the node, for a given frequency, to be larger rela- 
tive to the zero frequency fraction as the electrodes are placed closer 
together. If we assume that the membrane can be represented by a resistance in
series with a parallel circuit of resistance and capacitance, the same
qualitative result for the effect of the distance between electrodes will be
found. All of the above qualitative results are in agreement with those found
experimentally by Katz (1939b).
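The starting point of the qualitative argument, that a resistance and capacitance in parallel present a smaller impedance as the frequency rises, can be made concrete with the usual complex-impedance expression $Z = R/(1 + 2\pi i \nu R C)$. The sketch below uses the sheath values quoted earlier ($10^3$ ohm cm and .02 μf/cm) only as convenient illustrative numbers; the function name is ours.

```python
import math

def parallel_rc_impedance(resistance, capacitance, freq_hz):
    """Complex impedance of a resistance and a capacitance in parallel."""
    omega = 2.0 * math.pi * freq_hz
    return resistance / (1.0 + 1j * omega * resistance * capacitance)

R = 1.0e3        # illustrative resistance, ohms
C = 0.02e-6      # illustrative capacitance, farads
corner = 1.0 / (2.0 * math.pi * R * C)   # frequency at which |Z| has fallen to R / sqrt(2)

for freq in (0.0, 0.1 * corner, corner, 10.0 * corner):
    magnitude = abs(parallel_rc_impedance(R, C, freq))
    print(f"frequency = {freq:10.1f} cycles/sec: |Z| = {magnitude:8.2f} ohms")
```

At zero frequency the impedance is the resistance alone; well above the corner frequency the capacitance carries most of the current and the impedance falls off inversely with frequency.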

For a quantitative discussion of the above theory, the standard circuit 
analysis for alternating current will be used (cf. Page and Adams, 1931). 
In this treatment the impedance at any point is lumped together as a 
complex impedance and the equations are then solved. Then the absolute 
value of the resultant required current or voltage is found. In the analysis 
of the above theory, the final equation for the complex current through 
the node is so lengthy as to require about two-thirds of this page just to 
be written out. Hence, as expected, the absolute value of the current 
through the node would be so lengthy as to make it unmanageable. Also, 
in any but the simplest cases, the number of arbitrary parameters is too 
large to give a useful result. Thus, only the simplest case will be treated. 

We shall assume for simplicity that all of the fibers are identical, i.e.,
$f = 1$, that the sheath has a negligible resistance while the myelin has an
infinite impedance, and that the nodes are of infinitesimal length. We shall
not assume that the external resistance is negligible compared to the
effective parallel internal resistances of all of the fibers, and thus we
shall use a slightly different approach from that of J. J. Lussier and W. A.
H. Rushton (1952). Let $I_m(k)$ represent the total current through the $k$th
node, $Z_m$ be the impedance of a node, and $l$ be the distance between nodes,
which is, of course, assumed to be constant. Further assume that one electrode
is at the origin, which is located at the zeroth node, and the other electrode
is at infinity, both electrodes being of infinitesimal thickness. Then in a
manner completely analogous to that of Taylor [1952, eqs. (41)-(50)] we find
that


$$I_m(k+1) - \left[ 2 + \frac{l\,(n r_e + r_i)}{Z_m} \right] I_m(k) + I_m(k-1) = -\frac{l\, r_e}{Z_m}\, i_e(k). \qquad (25)$$


Due to the assumptions about the electrodes, $i_e(k)$ is zero for $k \neq 0$
and is equal to the total current through the electrodes, $i_0$, for $k = 0$.
Solving this equation by the standard methods (cf. Boole, 1946), we find
$I_m(k)$.

Thus, if the second electrode, instead of being placed at infinity, is placed
at the $h$th node, then by the superposition theorem we find

Thus, we see that $I_m(0)$, for zero frequency, increases as the absolute
value $|h|$ increases. Taking the second electrode at infinity, we have

Denoting the total resistance of the node by $R_m$, the total capacitance of
the node by $C_m$, and the frequency by $\nu$, we have
If we wish the ratio to reach the asymptotic value 2, we see that \;, = 3/4. 
Thus, $R_m = \tfrac{3}{4}\, l\,(nr_e + r_i)$, a result which is quite plausible on the basis of
experimental values for the frog sciatic nerve. The graph of (31) is shown 
in Figure 4, for 7, = 2 X 10-* sec, a value which is quite plausible. 
Comparing with the data obtained by taking the ratio of the theoretical 
threshold current computed from the two-factor theory to the experimen- 
tal threshold current from Table IV of Katz (1939b), we see that there is 
satisfactory agreement between theory and experiment. 
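The frequency dependence of the current through the node under the electrode can be sketched numerically. The block below rests on assumptions of ours that should be read as such: the difference equation is taken in the form $I_m(k+1) - (2 + g)\,I_m(k) + I_m(k-1) = -(l r_e/Z_m)\, i_e(k)$ with $g = l(nr_e + r_i)/Z_m$; the node is a resistance $R_m$ and capacitance $C_m$ in parallel, so that $g = g_0(1 + 2\pi i \nu \tau_m)$; currents are measured in units of the high-frequency asymptote $r_e i_0/(nr_e + r_i)$; and $g_0 = 4/3$, which corresponds to $R_m = \tfrac{3}{4} l (nr_e + r_i)$. The value $\tau_m = 2 \times 10^{-4}$ sec is hypothetical. The chain of nodes is truncated at 60 nodes on each side.

```python
import numpy as np

def node_current(freq_hz, g0=4.0 / 3.0, tau_m=2.0e-4, n_nodes=60):
    """Current through node 0, in units of the high-frequency asymptote."""
    g = g0 * (1.0 + 2j * np.pi * freq_hz * tau_m)
    size = 2 * n_nodes + 1
    a = np.zeros((size, size), dtype=complex)
    idx = np.arange(size)
    a[idx, idx] = -(2.0 + g)                  # coefficient of I_m(k)
    a[idx[:-1], idx[:-1] + 1] = 1.0           # coefficient of I_m(k + 1)
    a[idx[1:], idx[1:] - 1] = 1.0             # coefficient of I_m(k - 1)
    rhs = np.zeros(size, dtype=complex)
    rhs[n_nodes] = -g                         # electrode current enters at node 0 only
    return np.linalg.solve(a, rhs)[n_nodes], g

def ratio_numeric(freq_hz):
    """|I_m(0) at freq_hz| divided by |I_m(0) at zero frequency|."""
    return abs(node_current(freq_hz)[0]) / abs(node_current(0.0)[0])

def ratio_closed_form(freq_hz, g0=4.0 / 3.0):
    """Same ratio from |I_m(0)| = sqrt(|g| / |g + 4|), the decaying solution of the chain."""
    g = node_current(freq_hz, g0=g0)[1]
    return np.sqrt(abs(g) / abs(g + 4.0)) / np.sqrt(g0 / (g0 + 4.0))

for freq in (0.0, 500.0, 2000.0, 1.0e4, 1.0e7):
    print(f"{freq:12.1f} cycles/sec: ratio = {ratio_numeric(freq):.4f}")
```

Under these assumptions the ratio rises monotonically from 1 toward $\sqrt{1 + 4/g_0} = 2$, so that a dip at the upper frequencies would have to come from elements left out of this simplest case.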

The following points should be mentioned. First, if the electrode is not
placed directly above the node, the theoretical curve should not be expected
to vary its form by much. If the sheath and the myelin are introduced, then
their effect on the curve of (31), as stated before in the qualitative
discussion, would be for the former to increase the rise of the curve, as we
would wish, and the latter to cause the curve to bend down at the upper
frequency range. However, on the basis of the experimental values given by
Stämpfli (loc. cit.), the time constant of the node is less than that of the
myelin membrane. Thus, it appears that the dip in the experimental curve may
not be accounted for by this theory, although no definite statement about this
fact can be made due to the difficulty of knowing what the actual values of
the time constants were in the experimental set-up of Katz (1939b). It should
also be emphasized that the "dip" in the experimental curve is based solely on
two experimental points which indicate a decrease of the current from its
maximum value by only 2% and thus may not really represent a true decrease. A
second point which should be emphasized is that the above equation was derived
under the assumption that the total current through the node is the important
factor in excitation. If, however, we had taken the voltage across the node as
the factor of importance (as seems to be implicit in the Hodgkin-Huxley
theory), then the ratio $|V_m(0)_{\nu=\nu}/V_m(0)_{\nu=0}|$ would decrease as
the frequency increases, a result which is qualitatively opposite to what is
desired. Also, since there is no current through the $C_m$ for zero frequency,
and since the current through $R_m$ is directly proportional to $V_m$, we see
that if the above explanation is correct then the basic factor in excitation
is the total current through the node. Finally, it should be stated that the
general explanation offered in this section should apply equally well to any
excitation in which the time course is altered, such as condenser discharges,
etc.

FIGURE 4. The dotted line represents the ratio
$|I_m(0)_{\nu=\nu}/I_m(0)_{\nu=0}|$ vs. frequency, graphed according to
equation (31). The solid line represents the ratio of the theoretical
threshold current computed from the two-factor theory to that found
experimentally vs. frequency. The circled points are computed from Table IV of
Katz (1939b).


The author wishes to express his gratitude to Professor H. D. Landahl
for his many helpful discussions and suggestions. 
The optimal systems approach to the muscular system leads to difficulties since the prop-
erties of the muscular system are determined to a great extent by the nature of the contractile 
unit or molecule. This unit has determined the morphology and dynamic characteristics of 
muscle, and only smaller order alterations are then possible to adapt muscle to its several func- 
tions. 

A model of the contractile unit is developed that shows agreement with experimental find- 
ings with respect to the velocity-load relation, heat effects, and several aspects of knowledge 
of the structure of the contractile proteins. 


In the first two papers of this series (Cohn, 1954; 1955), emphasis 
has been placed on mathematical aspects of the optimal system approach 
to a problem of biological importance, the form of the vascular tree. 
Physicomathematical methods could be applied in detail to this prob- 
lem only because we could analyze the vascular tree form in terms of 
a simple condition known to be of survival value to the organism. This 
condition is that the vascular tree shall have a minimum resistance to 
fluid flow. We were able to find the conditions which minimize the flow 
resistance of a branching system of vessels such as the vascular tree by 
well-known mathematical methods. From such basic considerations a 
system was then constructed very similar in form to the actual vascular 
tree. 

In the above papers emphasis was placed on engineering aspects of the 
constructed system. There was also a discussion of the biological signifi- 
cance of organic system construction based on engineering principles. It 
was stated that, in the case of a highly evolved organ system, the system 
actually developed will probably have greater survival value for the or- 
ganism. This fact may then be used when considering why the system de- 
veloped the way it did. That is, we first determine the factors influencing 
survival value and then construct a form utilizing available materials to 
perform the required function, a procedure that was discussed in great 
detail in the first paper of this series.

The choice of the vascular system was fortuitous in that survival value
could be correlated with a well-defined physical concept, the resistance of 
the system to fluid flow. Then the biological problem of maximizing sur- 
vival value was analogous to the mathematical problem of minimizing the 
flow resistance of the system. The success of the analysis applied to this 
one simple example gives us hope that we have come a little closer to stat- 
ing in mathematical form one of the well-known basic laws of life—the 
survival of the fittest. This in turn is an important determiner of the na-
ture of organic form. 

The approach described above may, we believe, be successfully ex- 
tended to organic systems other than the vascular tree. However, it was 
not our intention to convey the idea that all systems could be so simply 
analyzed in terms of a single explicit factor (flow resistance in the above 
example). The present paper will be an extension of the "optimal systems"
approach to a more complex problem.

Implicit in the solution of the problem of the form of the vascular tree 
were limiting conditions placed on certain factors that might affect the 
final solution. For example, we assumed the physical characteristics of the 
blood could be treated as constant, and the form of the vascular tree 
could be altered independently of these. However, the physical charac- 
teristics of the blood also placed limits on possible variations of the form 
of the vessels. An obvious extension of the method is then the considera- 
tion of several factors affecting the survival value of the organ system. 

One problem that will illustrate the intended broad approach of the 
optimal systems analysis is that of general considerations on muscle. The 
mammalian muscular system is very complex, and represents a large part 
of the mass of the mammalian body. Considerations on the ‘optimal’ 
aspects of the muscle system immediately lead the reader to expect specu- 
lations on why we have as many muscles as we do, the economy of the 
mass of the muscular system relative to the rest of the body, etc. Impor- 
tant as these questions are, there is one basic consideration whose under- 
standing is necessary for an answer to the type of question mentioned 
above. This is an understanding of the nature of the contractile mecha- 
nism itself. Only when we understand this can we speculate concerning its 
possible variations to suit different functions in the body. However, be- 
fore considering the contractile mechanism in detail we shall briefly sum- 
marize the many possible factors affecting the form of the muscular sys- 
tem. The following is one list of possible regions of variation for the type 
of factor that we will consider. With each region examples of possible 
variations within the region have been listed as illustrations. 
Intra-cellular factors:


1. Basic chemistry of the cell. The possibility of contractile tissue exist- 
ing as we know it is due to the fact that the main structural material of 
plants, cellulose, is replaced by the more labile proteins of the keratin 
group in the animal cell. 

2. The specific biochemistry of muscle. The actin-myosin protein sys- 
tem is very specific and does not allow for any variation among organisms. 
The metabolic phosphate transport system involving creatine in the verte- 
brates and arginine in the invertebrates illustrates adaptation to different 
modes of life. 

3. The organization of the muscle cell. Localization of function is the 
most important phenomenon, which is particularly important in the dif- 
ferences between smooth and striated muscle. This is an adaptation to a 
speeding up of the molecular machinery of muscle. 


Inter-cellular factors: 

4. The organization of cells to form muscle. This is illustrated by the 
difference between skeletal striated muscle (in general parallel muscle 
cells), and cardiac muscle (a syncytium of cells). 

5. The organization of muscles within the body to form the complete 
muscular system. Under this heading we would consider partitioning of 
different relative amounts of muscle to different functions in different 
organisms. 

6. The determination of the form of the entire organism by the above 
factors. 

The intra-cellular factors will be treated in this paper. It is planned to 
treat the inter-cellular factors in a future paper. The following is an ex- 
ample of the type of treatment planned. This example relates to the last 
of the factors mentioned above. 

Determination of the optimum size of an organism considering the limita- 
tions placed on it by its muscular system. 

Given an organism of mass M and specific gravity $\sigma$, the volume of the organism will be $M/\sigma$. A typical linear dimension $L_T$ we define as a length such as an average diameter for a simple organism, or the length of an appendage or trunk, etc., for a more complex organism. Then

$$L_T = \left(\frac{M}{\sigma}\right)^{1/3} \tag{1}$$

is a typical linear dimension.
We desire the contractile system to move the body of mass M through a distance $L_T$. Thus the work, W, that we expect of the contractile system per contraction is

$$W = M g L_T, \tag{2}$$


in which g is the gravity constant. 
If we call m the mass of the contractile unit, then $W_{cu}$, the work we can expect of the contractile unit per unit weight per contraction, is

$$W_{cu} = \frac{W}{m} = \frac{M g L_T}{m}. \tag{3}$$

Let us define the contractile substance as the specific molecule or molecules able to contract and change linear dimension. Then not all the contractile unit will be contractile substance. Call the fraction of the contractile unit that is contractile substance $\gamma$, and assume this fraction to be independent of body mass. Then we see that the work expected per unit weight of contractile substance is

$$W_{cs} = \frac{M g L_T}{\gamma m}. \tag{4}$$

Substituting (1) in (4) and noting that m is proportional to M we obtain

$$W_{cs} = K M^{1/3}. \tag{5}$$


Since the work obtainable from a single contractile unit is constant, the 
above equation implies an upper limit to the size of organisms if the rela- 
tive size of the muscle and the variety of movements are kept constant. 
To exceed this size would mean increased sluggishness or loss of variety 
of movement. This may help explain the fact that animals in the weight 
region of several hundred pounds are the most successful land predators 
(i.e., the lion and tiger and other large members of the cat family), as well 
as the fastest land animals. 
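
The scaling argument above can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function name, the arbitrary consistent units, and the default values for the muscle mass fraction and the contractile fraction are ours and do not come from the text. Only the chain of relations (1) to (5) is taken from it.

```python
def work_per_unit_contractile_substance(M, sigma=1.0, g=9.81,
                                        muscle_fraction=0.4,
                                        contractile_fraction=0.5):
    """Work demanded per unit weight of contractile substance per contraction.

    Steps follow the text:
      L_T  = (M / sigma) ** (1/3)          typical linear dimension, eq. (1)
      W    = M * g * L_T                   work per contraction, eq. (2)
      m    = muscle_fraction * M           m taken proportional to M
      W_cs = W / (contractile_fraction*m)  eq. (4)
    All default values are illustrative assumptions, not data from the paper.
    """
    L_T = (M / sigma) ** (1.0 / 3.0)
    W = M * g * L_T
    m = muscle_fraction * M
    return W / (contractile_fraction * m)


if __name__ == "__main__":
    # An eightfold increase in mass doubles the demand on the contractile
    # substance, as expected from the cube-root dependence in eq. (5).
    for M in (1.0, 8.0, 64.0):
        print(M, work_per_unit_contractile_substance(M))
```

Because the muscle fraction and the contractile fraction cancel out of any ratio, the cube-root growth does not depend on the particular defaults chosen here.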

Another topic to be considered would be the possible modes of organiza- 
tion of cells to form muscle. Different length-tension diagrams are possible 
with different structural patterns. The increased possible tension of heart 
muscle at greater than rest length is an interesting adaptation of the nor- 
mal length-tension diagram of muscle (maximum tension at near rest 
length) to a situation in which the increased tension at great lengths is 
necessary. 

However, before considering these special situations it is necessary to consider the intra-cellular limitations placed on the muscular system. This will be done in the present paper, where we will deal mainly with considerations on the contractile system of muscle, which we will later utilize in further work on more general aspects of the muscular system (the approach outlined above). The present work is then a necessary aside to the main theme of the series.

Present state of theory construction concerning muscle. There is a large 
spectrum of experimental approaches to the study of muscle as well as 
many levels of theorizing concerning the nature of the contractile mecha- 
nism. In this paper we will be concerned with a theoretical approach to the 
subject. Naturally, the motivation as well as validation of any such ap- 
proach is dependent on the accumulated experimental data in the field. 
This data has been adequately reviewed recently and for this the reader is 
referred to M. Dubuisson (1954), A. Szent-Gyorgyi (1951), W. F. H. M. 
Mommaerts (1950), and D. R. Wilkie (1954). 

Even a casual study of the experimental data will convince the reader 
that a great deal of time and effort has been devoted to studies of the 
contractile mechanism. However, the uncertain state of knowledge and 
theorizing concerning the contractile mechanism is illustrated by the 
present conflict of views concerning very fundamental questions, for example, the still undecided question of whether contraction or relaxation is the "active" part of the contractile cycle, and the question of whether contraction is entropic or due to internal changes.

Considering the difference between the many experimental facts ob- 
tained and the lack of an adequate theoretical explanation of the physio- 
logical phenomena it might be useful to investigate the nature of theory 
construction in this field in the hope of throwing light on its inadequacies. 

Most theoretical attempts to explain muscle action have been based on 
some known molecular physicochemical mechanism. Then the observed 
action has been interpreted as a macroscopic manifestation of the micro- 
scopic phenomena. Wherever the model contradicted observed properties 
of muscle it was altered. However, the emphasis has been on the molecu- 
lar interpretation. This approach is typical of recent physicochemical ap- 
proaches to biological problems. 

In the specific case at hand we have many physicochemical observa- 
tions available. While a good theory must ultimately explain as many ex- 
perimental facts as possible, because of the complexity of the present pic- 
ture it is difficult to build a theory from some specific facts. Our efforts 
will be better directed if we consider the large-scale picture and the conse- 
quences of present knowledge concerning the general molecular structure 
of organic molecules; in the case of muscle particular emphasis will be 
placed on the protein molecule. 
With the previous pages providing a background for our approach we will now give a detailed presentation.

The general structure of striated muscle: 

Striated muscle is composed of many long cells held together by con- 
nective tissue. The cells are passively bound together and we may consid- 
er as the anatomical unit of function a single one of these cells. Thus we 
may limit ourselves at present to considerations of the single cell. The 
muscle cell must consist of the following components. 

1. A structure to keep the cell localized. This is necessary because of 
the fluid nature of most of the cell. This structure is the sarcolemma. Ana- 
tomically it helps maintain the integrity of the cell by inhibiting long ex- 
tensions. 

2. A matrix to support the functional components of the contractile 
system within the cell. This matrix will be fluid in nature. 

3. A metabolic system to supply energy. 

4. A contractile system which converts metabolic chemical potential energy into mechanical energy.

5. A trigger mechanism to activate the cell. This is the mechanism of 
membrane stimulation and possibly the reactions of the muscle cell to 
varying amounts of tension. 

The contractile mechanism. Of the above listed components the one that 
most distinguishes the muscle cell from other cells is the contractile 
mechanism. Since we are not here concerned with theories of general cellu- 
lar function, we will limit our speculations to this contractile component 
of the muscle cell. 

There are in existence several kinetic theories of this unit. The most 
complete exposition is found in M. J. Polissar (1951) and essentially simi- 
lar ideas are investigated by F. Buchtal and E. Kaiser (1951). Their mod- 
els involve units capable of existing in two forms, which for convenience 
we may call long and short. A single muscle fiber is then considered to be 
made of many such units strung together in series. The kinetics for the 
transition of a single unit from the long to short and from the short to long 
form are considered and the state of the fiber is then obtained as a thermo- 
dynamic equilibrium of short and long forms. 

Subdivision of the contractile mechanism. Our model will start with one 
basic assumption similar to the above. This is that the contractile units 
can exist in one of two states, a long or short configuration. However, we 
assume that the fiber consists of another element which we call the trans- 
missive element. This transmissive element functions in a passive role and 
serves simply to maintain the contracted muscle in a certain state. 
Justification for introducing idea of transmissive unit. Since the introduction of a new element into the muscle mechanism introduces a complexity not found in simpler models, we feel there should be an adequate
justification for this step. The main justification comes from considera- 
tions on the dimensional changes possible to a contracting fiber. The 
stretched stimulated fiber may contract from 200% rest length to less 
than 60% rest length (Davson, 1952, p. 491). If the fiber were to consist 
of contractile units arranged in series, since the contraction of the fiber 
is the sum of the contractions of the units of which it is composed, a single 
unit must be able to reversibly shorten over a range of at least three times 
its shortest length. However, the best evidence available today concerning the structural chemistry of muscle proteins indicates a much smaller magnitude of reversible contraction [see transformations from pleated sheet configuration ($\beta$) to folded ($\alpha$) chain of keratin-like proteins in Pauling and Corey (1951a, b) and Pauling, Corey, and Branson (1951)].
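
The factor of three quoted above can be checked directly from the cited length limits. As a short worked check, assuming a simple series arrangement in which every unit changes length in the same ratio as the whole fiber (the symbols $\ell_{long}$, $\ell_{short}$, and $\ell_{rest}$ are notation introduced here for illustration only):

```latex
\frac{\ell_{long}}{\ell_{short}}
  \;>\; \frac{2.00\,\ell_{rest}}{0.60\,\ell_{rest}}
  \;\approx\; 3.3 ,
```

so each unit would have to shorten reversibly over a range exceeding three times its shortest length.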

A model of the contractile mechanism. We assume that the muscle contractile mechanism consists basically of two parts as outlined above: one,
a so-called active unit, which shortens relatively little and which also pro- 
vides the energy of contraction; the other we call the passive unit (Fig. 1). 
The latter may be conceived as a long-chain molecule, which can fold into 
a much shorter one. Before the active unit contracts, it is assumed to be- 
come rigidly linked to the passive unit. Therefore its contraction results 
in a shortening of the passive unit also. We assume, moreover, that the 
rate of return of the active unit to its original length is much faster than 
the rate of return of the passive unit. During the return the two parts are 
assumed to be disconnected. We then have the following chain of events. 

When relaxed the muscle may be freely moved or stretched within lim- 
its imposed by the elasticity of the sarcolemma (Fig. 1A). Thus we as- 
sume very weak or no binding together of the component parts of the con- 
tractile mechanism. Since very little heat is generated, we also assume no 
chemical reactions other than the normal non-contractile metabolic re- 
actions. 

When stimulated, the following is postulated: 

1. Electrical and consequent ionic changes in and around the sarcolem- 
ma and possibly throughout the fiber induce a changed internal state of 
the fiber. 

2. The changed internal state of the fiber causes a binding together of 
the passive unit as well as a binding of the active to the passive units (Fig. 
1B; see below for details of these two processes). 

3. The active units are now able to contract, supplying mechanical 
energy to the passive units (the source of this energy being the metabolic system; Fig. 1C).

4. After the active unit has contracted, its binding to the passive unit is broken. Then the active unit returns to its long state very rapidly. During this time of return the passive unit has partly returned to its original long length by the tension existing in the fiber (Fig. 1D). This return, however, is not complete. Therefore, when the extended active unit is again bound to the passive unit, it binds more elements of the passive unit than it bound in the first instance. (In Fig. 1B three elements of the passive unit are bound to the contractile unit. In Fig. 1E five such elements are bound.) Therefore the next contraction of the active unit results in a greater shortening of the passive unit than the preceding contraction gave rise to.

FIGURE 1. Diagrammatic illustration of the coupling and mode of action of the active and passive units of muscle.
A. Relaxed muscle fiber.
B. Stimulated muscle; active and passive units bound.
C. Contracted position of active unit; passive unit pulled in with it.
D. Relaxation of binding while active unit relaxes and returns to long (L) state; passive unit partly regains original length during this phase.
E. Beginning of new cycle with active and passive units again bound together.




Binding of the passive units. We assume the passive units are long mole- 
cules with their long dimension oriented parallel to the long dimension of 
the fiber. As noted above, since the relaxed muscle can be easily stretched, 
we assume no strong inter- or intra-molecular binding forces holding the 
passive units. We might idealize this situation by means of the following 
simple model. We liken the relaxed passive units to a chain made of freely 
moving links. During stimulation the internal milieu of the chain is 
changed and this induces a freezing of the links in position. Now if we as- 
sume the muscle passive units to be fixed in position on stimulation, the 
question arises of how the muscle is able to contract on stimulation. We 
assume the contraction is due to a continuous working of the active units 
which act on and contract the passive units. The mechanism of this action 
will be elaborated in the remainder of this paper. 

Mechanism of contraction of the active units. We assume the active units can exist in two states, which we will call long (L) and short (S) states. (We needn't be concerned with the δ state postulated to exist at very short lengths of the fiber; see Ramsey, 1950, for the meaning of this state.) The transition L → S is effected by the input of energy from a metabolite, and the transition S → L is spontaneous. However, we assume the transition L → S takes place in the stimulated muscle fiber only when the active unit is bound to the passive unit. Then the active unit is able to effectively contract the passive unit. The return reaction (S → L) we assume is preceded by an unbinding of the active and passive units. We further assume
that the time of contraction and the time the active unit remains in the 
contracted (S) state are very short compared with the time of one com- 
plete cycle of the active unit. Therefore almost all the time of the cycle is 
spent by the units bound to each other and not contracted. 

All these aspects of the cycle of the active unit are illustrated in Figure 1.

Quantitative aspects of the contractile system model. In the following we 
will consider observations made on the muscle at or near rest length. For 
example the isometric tension, referred to as $P_{iso}$, will always mean the
isometric tension at rest length. Later in the paper we will consider pos- 
sible interpretations of the model by means of which it may be extended 
to long and short lengths and thus account for the length-tension diagram 
and other associated experimental findings. The reader should keep this 
in mind in reading the following. 

The active unit has two permissible states. Each state is characterized by a specific length, one associated with its extended (relaxed) state and the other with its shortened (contracted) state. The difference between these two lengths we call $\Delta x_M$. Thus $\Delta x_M$ is the distance through which the active unit contracts. As noted above, we assume that the active unit pulls the passive unit this distance with each contraction. After the contraction the active unit releases the passive unit and they both relax. The active unit then reassumes its original relaxed configuration and is then again bound to the passive unit. During the relaxation time of the active unit, the passive unit is also able to relax. We assume that the return of the passive unit is proportional to the instantaneous tension existing in the muscle. Then we may call the return of the passive unit $\alpha P$, where P is the instantaneous tension per cross-sectional area existing in the muscle and $\alpha$ is a constant of proportionality.

The quantity $\Delta x_M$ defined above is the dimensional change of the active unit during contraction. Since the passive unit is pulled through this distance with each contraction of the active unit and then tends to return to its original configuration during the relaxation part of the cycle (when the two units are uncoupled), we may define an effective contraction distance for the passive unit. By this we mean the distance contracted by the active unit ($\Delta x_M$) less the return of the passive unit during the relaxation phase. Calling the effective contraction distance $\Delta x$, we may define it by the following equation:

$$\Delta x = \Delta x_M - \alpha P. \tag{6}$$

We may note that at zero tension (P = 0) we find that

$$\Delta x = \Delta x_M. \tag{7}$$


This we interpret as follows. With no tension in the passive units there 
is no return to the pre-contraction length of the transmissive unit during 
the relaxation phase of the active unit cycle (Fig. 1D). Thus the resultant 
shortening of the passive units (which is related to the observed shorten- 
ing) indicates the entire extent of shortening of the active unit. 

Similarly, let us consider the muscle stimulated isometrically. At isometric tension the effective shortening of the muscle is zero ($\Delta x = 0$). Thus at isometric tension (6) becomes

$$\Delta x_M - \alpha P_{iso} = 0.$$

This relation enables us to evaluate $\alpha$, and we find

$$\alpha = \frac{\Delta x_M}{P_{iso}}. \tag{8}$$

And then (6) may be written as

$$\Delta x = \Delta x_M - \frac{\Delta x_M}{P_{iso}}\,P$$

or

$$\Delta x = \Delta x_M\left[1 - \frac{P}{P_{iso}}\right]. \tag{9}$$
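
The linear relation between effective contraction distance and tension can be written as a small function. This is a minimal sketch, not part of the original treatment: the function and argument names are ours, with `dx_max` standing for the full contraction distance of the active unit and `P_iso` for the isometric tension at rest length.

```python
def effective_contraction(P, dx_max, P_iso):
    """Effective contraction distance of the passive unit per cycle.

    Implements eq. (6) with the proportionality constant fixed by eq. (8),
    which together give eq. (9): dx = dx_max * (1 - P / P_iso).
    """
    if not 0.0 <= P <= P_iso:
        raise ValueError("P must lie between 0 and P_iso")
    alpha = dx_max / P_iso      # eq. (8)
    return dx_max - alpha * P   # eq. (6)


if __name__ == "__main__":
    # Arbitrary illustrative values: full distance at zero tension,
    # no net shortening at isometric tension.
    for P in (0.0, 5.0, 10.0):
        print(P, effective_contraction(P, dx_max=1.0, P_iso=10.0))
```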


Next we will consider the rate at which the active units work. During 
muscular contraction the only factor that we consider as affecting their 
reaction rate is the tension of the transmissive units. We assume that in- 
creasing the tension in the passive units decreases the probability of the 
active units contracting. In quantitative form, if v is the average working 
frequency of the active units (i.e., the average number of cycles described 
in Fig. 1 per unit time) at any tension P per unit cross-sectional area, and 
v. is the turnover frequency at zero tension (the maximum working fre- 
quency), then 

BV=yELeF , (10) 
in which @ is a constant. 

The exponential form of the relation between turnover frequency and 
tension assumes that there is a one-dimensional barrier to the transition 
of the active unit into the contracted state. Then the Arrhenius energy of 
activation is increased by an amount proportional to the observed tension, 
and this fact is expressed by equation (10). 
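
The exponential dependence of working frequency on tension has a simple consequence that is easy to verify numerically: equal increments of tension reduce the frequency by equal factors. The sketch below assumes only the exponential form of equation (10); the names `nu0` (zero-tension frequency) and `beta` (the constant in the exponent) and the numerical values are illustrative choices of ours.

```python
import math


def working_frequency(P, nu0, beta):
    """Average working frequency of the active units at tension P, eq. (10)."""
    return nu0 * math.exp(-beta * P)


if __name__ == "__main__":
    nu0, beta, step = 100.0, 0.5, 1.0   # arbitrary illustrative values
    previous = working_frequency(0.0, nu0, beta)
    for k in range(1, 4):
        current = working_frequency(k * step, nu0, beta)
        # Each equal step in tension multiplies the frequency by exp(-beta*step).
        print(k * step, current, current / previous)
        previous = current
```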

Equations (6) and (10) are the two important equations of our pro- 
posed model. With them we may now proceed to some simple interpreta- 
tions of experimental findings. Later in this paper, when considering the 
heat effects of contraction, new assumptions concerning the heat rates of 
reactions will be necessary. Before complicating our model in that fashion 
we will consider some simple kinetic results of muscular contraction which 
may be elaborated with the mechanism now available. 

A. The velocity of contraction. 

The velocity of contraction, which we call v, may be obtained as the product of the turnover frequency of the active units, the number of contractile units in the muscle, and the distance each contraction of an active unit shortens the muscle, which we call $\Delta X$. Then

$$v = \nu \cdot n l A \cdot \Delta X, \tag{11}$$

in which n is the number of active units per unit volume of muscle and lA is the volume of the muscle.

The quantity $\Delta X$ may be related to the $\Delta x$ of equation (9) in the following way. The $\Delta x$ we have obtained in (9) represents the distance by which one working cycle of the active unit shortens the particular passive unit on which it acts. However, a muscle or muscle fiber consists of many passive units in parallel. Assuming muscle tissue to be homogeneous, we may assume that the number of passive units per unit cross-sectional area is constant. Call this number N. Then the total number of passive units in parallel in the muscle is NA, in which A is the cross-sectional area of the muscle. It is then reasonable that with this number of passive units in parallel a contraction of all passive units through a distance $\Delta x$ will contract the entire muscle the same distance. We may then say that on the average the amount a single contraction of an active unit effectively contracts the muscle is only a small fraction of the amount it contracts the passive unit on which it acts. In particular,

$$\Delta X = \frac{\Delta x}{NA}. \tag{12}$$

Inserting (9) in (12) we obtain

$$\Delta X = \frac{\Delta x_M}{NA}\left[1 - \frac{P}{P_{iso}}\right]. \tag{13}$$

Finally, substituting in (11) expressions (10) and (13) we find

$$v(P) = \frac{\nu_0\, n\, l\, \Delta x_M}{N}\, e^{-\beta P}\left[1 - \frac{P}{P_{iso}}\right]. \tag{14}$$

The graphical form of this relation is shown in Figure 2.

FIGURE 2. Theoretical velocity-tension curve.

The experimentally determined velocity-load relation is similar in form 
(Buchtal and Kaiser, 1951, p. 150). After considerations of heat evolution 
during muscular contraction we will be able to consider this relation in 
greater quantitative detail and make comparisons with experimentally 
determined relations. 
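
The shape of the theoretical velocity-tension curve can be tabulated with a few lines of Python. The sketch below is an illustration under stated assumptions: it works with the ratio v(P)/v(0), so that the constant prefactor of equation (14) drops out, and it writes the exponent as `gamma * x` with x the tension as a fraction of the isometric tension. The dimensionless parameter `gamma` and its default value of 2 are assumptions of this sketch.

```python
import math


def relative_velocity(x, gamma=2.0):
    """Velocity of contraction relative to its zero-load value.

    x is the tension as a fraction of the isometric tension (0 <= x <= 1);
    the product of the exponential frequency factor and the linear
    contraction-distance factor follows eq. (14).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    return math.exp(-gamma * x) * (1.0 - x)


if __name__ == "__main__":
    # The curve falls monotonically from 1 at zero load to 0 at isometric load.
    for i in range(0, 11):
        x = i / 10.0
        print(f"{x:.1f}  {relative_velocity(x):.4f}")
```

Both factors decrease with tension, so the curve is concave upward near zero load and reaches zero velocity exactly at the isometric tension.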
B. Heat rates.

Muscle can do work on contraction. In our model the passive units are 
moved by the active units. Thus the energy for contraction comes from 
the active units. These in turn must derive their energy from some 
source. This is known to be energy derived by the degradation of metabo- 
lite molecules or chemical bonds from states of high to low chemical po- 
tential energy. The carrier of this chemical potential energy has been 
shown to be the high energy phosphate bond. However, we are little con- 
cerned with the specific biochemistry of muscular metabolism. We are in- 
terested in making a broad and non-specific assumption concerning the 
energy input to the active unit. This is that each time the active unit 
contracts it degrades one unit of metabolic energy. The energy that is so 
released we call $\Delta F$. Once this energy is released from the metabolite carrier by the active unit we assume that no part of it can be reconverted to
chemical potential energy. (This assumption is contrary to a hypothesis 
presented by Needham, 1950, in which he supposes that the energy supply 
to muscle is by means of a reversible chemical reaction. During contrac- 
tion the heat given off is partly activation heat and partly heat of con- 
traction. During slow tetanic stretch the heat may be less than isometric 
and this he supposes may be due to normal activation heat and negative 
heat of contraction. Thus he allows for the possibility of a negative heat in 
the working of the contractile system. This is equivalent to a reversal of 
the reaction whereby energy is normally given to the active unit.) 

Since we assume the energy from the metabolite molecule is all degraded, it must all appear in the work done by the active unit and the heat generated by the muscle. The work $\Delta W$ done by each working of the active unit is

$$\Delta W = PA \cdot \Delta X, \tag{15}$$

in which PA is the tension against which the unit shortens the muscle a distance $\Delta X$. Substituting (13) in the above equation we find

$$\Delta W = \frac{P\,\Delta x_M}{N}\left(1 - \frac{P}{P_{iso}}\right). \tag{16}$$

This expression gives the work done per working of the active unit in terms of the tension P against which the active unit works.

With the above expression for work we may obtain the heat generated per turnover of an active unit. If we call this heat $\Delta H$, then

$$\Delta H = \Delta F - \Delta W. \tag{17}$$

Substituting $\Delta W$ from (16) in the above we obtain

$$\Delta H = \Delta F - \frac{P\,\Delta x_M}{N}\left(1 - \frac{P}{P_{iso}}\right). \tag{18}$$

Finally, we may derive from the above equations an expression for the time rate of heat generation. If we call this time rate $\Delta Q/\Delta t$ we find

$$\frac{\Delta Q}{\Delta t} = \nu\,\Delta H. \tag{19}$$

Substituting (10) and (18) in the above we obtain

$$\frac{\Delta Q}{\Delta t} = \left[\nu_0\, e^{-\beta P}\right]\left[\Delta F - \frac{P\,\Delta x_M}{N}\left(1 - \frac{P}{P_{iso}}\right)\right]. \tag{20}$$


This concludes the presentation of the basis of our model. We have 
postulated the effective distance of contraction of an active unit for a 
single contractile cycle [eq. (9)], the rate of working of an active unit 
[eq. (10)], the velocity of contraction of muscle in terms of the above [eq. 
(14)], the work output and heat production for a single working of the 
active unit [eqs. (16) and (18)], and the rate of heat production [eq. (20)]. 
The only affirmation of our model has been obtained in a qualitative man- 
ner from the shape of the expected load-velocity relation (Fig. 2). 
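
The energy bookkeeping summarized above can be collected in a short Python sketch. It is illustrative only: the function and argument names are ours (`dx_max` for the full contraction distance of the active unit, `N` for the number of passive units per unit cross-sectional area, `dF` for the energy released per working of an active unit), the heat rate is taken per active unit, and the numerical values are arbitrary.

```python
import math


def work_per_turnover(P, P_iso, dx_max, N):
    """Work done per working of an active unit, eq. (16)."""
    return (P * dx_max / N) * (1.0 - P / P_iso)


def heat_per_turnover(P, P_iso, dx_max, N, dF):
    """Heat per working of an active unit, eqs. (17)-(18): dF minus the work."""
    return dF - work_per_turnover(P, P_iso, dx_max, N)


def heat_rate(P, P_iso, dx_max, N, dF, nu0, beta):
    """Heat rate per active unit: working frequency (eq. 10) times heat per turnover."""
    frequency = nu0 * math.exp(-beta * P)
    return frequency * heat_per_turnover(P, P_iso, dx_max, N, dF)


if __name__ == "__main__":
    P_iso, dx_max, N, dF, nu0, beta = 2.0, 1.0, 4.0, 0.5, 10.0, 1.0
    for P in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(P,
              work_per_turnover(P, P_iso, dx_max, N),
              heat_rate(P, P_iso, dx_max, N, dF, nu0, beta))
```

The work per turnover vanishes at both zero and isometric tension and is largest at half the isometric tension, where it equals one quarter of `dx_max * P_iso / N`.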

Since we have postulated mechanisms involving observable quantities, 
it is desirable at this point to consider a comparison of our model with 
experimental data. 

C. Comparisons with experimental data. 

The basic system of our model we assume to be unique in the sense that 
it is of very similar form in many vertebrate forms. This system is the 
active unit together with its passive unit (the actin, myosin system). 
However, we know that there are differences among different types of 
muscle mainly exhibited by the rate at which they work, i.e., slow and 
fast muscles. These differences we attribute to the various possibilities of 
aggregation of these units as well as possible metabolic differences. In 
other words, the contractile system is set in pattern and this basic system 
can fit various functions by alteration of the metabolic pathways supply- 
ing it. With this in mind we may proceed to comparisons of our model 
with experimental data. 

Maximum isometric tension. Equation (6) determines the maximum tension in the passive unit. This is the tension at which the dimensional change of the active unit ($\Delta x_M$) is just balanced by the return of the passive unit ($\alpha P$) during the return of the active unit to its extended state. Since $\alpha$ is a constant characteristic of the binding of the contractile system and its surroundings, we assume it not to be dependent on the type of muscle.

Tension is exerted by the many passive units in the muscle acting in parallel. If there are NA passive units, then

$$P \cdot A = NA\,p, \tag{21}$$

in which p is the tension in each unit. However, from (6) the maximum tension $p_M$ in each passive unit is

$$p_M = \frac{\Delta x_M}{\alpha}.$$

And since $\alpha$ is constant this maximum tension is not dependent on the type of muscle. Then, writing $P_M$ for the corresponding maximum tension per unit cross-sectional area,

$$P_M A = NA\,\frac{\Delta x_M}{\alpha}. \tag{22}$$

Since we assume a constant cross-sectional density of passive units, NA = kA, where A is the cross-sectional area of the muscle and k is a constant. Substituting this in (22) we obtain

$$P_M = \frac{k\,\Delta x_M}{\alpha}. \tag{23}$$

This equation expresses the fact that the maximum tension per unit cross-sectional area is independent of the type of muscle. This is a well-known experimental result (Hill, 1949).

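Velocity load relation. Before considering equation (14) above, we must first evaluate the parameter $\beta$, which can be done from considerations of heat production.

Consider the relation [eq. (20)] between heat rate and tension,

$$\frac{\Delta Q}{\Delta t} = \nu_0\, e^{-\beta P}\left[\Delta F - \frac{P\,\Delta x_M}{N}\left(1 - \frac{P}{P_{iso}}\right)\right].$$

During free release of the muscle we find

$$\frac{\Delta Q}{\Delta t}(P = 0) = \nu_0\,\Delta F. \tag{26}$$

When held isometrically we find

$$\frac{\Delta Q}{\Delta t}(P = P_{iso}) = \nu_0\, e^{-\beta P_{iso}}\,\Delta F. \tag{27}$$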
Thus the ratio of isometric heat to heat produced when the muscle is contracting with maximal velocity is

$$\frac{\dfrac{\Delta Q}{\Delta t}(P = P_{iso})}{\dfrac{\Delta Q}{\Delta t}(P = 0)} = e^{-\beta P_{iso}}. \tag{28}$$

For reasons that will be apparent in the later development of this section we will change our notation in the following manner:

$$\beta = \frac{\gamma}{P_{iso}}, \tag{29}$$

in which $\gamma$ is a dimensionless constant. Then (28) may be rewritten

$$\frac{\dfrac{\Delta Q}{\Delta t}(P = P_{iso})}{\dfrac{\Delta Q}{\Delta t}(P = 0)} = e^{-\gamma}. \tag{30}$$

Hill (1938) reports that the rate of energy liberation at zero load is 
about five times as large as the isometric heat rate. However, this does 
not allow for the fact that some heat is given off by the incidental processes 
that keep the muscle in a state of excitation, and also that during free 
release of the muscle (zero load) the active units are actually working un- 
der a small tension which is necessary to pull the muscle, and so are not 
working at their maximum rate. If we let $\gamma = 2$, then substituting this value in (30) gives us

$$\frac{\dfrac{\Delta Q}{\Delta t}(P = P_{iso})}{\dfrac{\Delta Q}{\Delta t}(P = 0)} = e^{-2} \approx \frac{1}{7.4}. \tag{31}$$

With this value of y we may substitute in our velocity load relation 
[eq. (14)] and obtain the following values, which we present compared 
with some of Buchtal’s (1951) experimental values in Figure 3. 
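
The choice of the dimensionless constant can be related to the reported heat-rate ratio with a short calculation. The sketch below assumes only that the ratio of isometric to zero-load heat rate is a decaying exponential in this constant, as in equation (30); the function names are ours.

```python
import math


def isometric_to_zero_load_heat_ratio(gamma):
    """Ratio of isometric heat rate to zero-load heat rate, eq. (30)."""
    return math.exp(-gamma)


def gamma_for_ratio(ratio):
    """Value of the dimensionless constant that reproduces a given ratio."""
    return -math.log(ratio)


if __name__ == "__main__":
    # Zero-load rate as a multiple of the isometric rate for the adopted value 2.
    print(1.0 / isometric_to_zero_load_heat_ratio(2.0))
    # Constant implied by taking the reported factor of five at face value.
    print(gamma_for_ratio(1.0 / 5.0))
```

Taking the factor of five at face value would give a constant somewhat below 2; the adopted value of 2 corresponds to a larger ratio, in the direction of the two corrections discussed in the text.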


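The heat of shortening. Heat is generated by the muscle at a rate governed by equation (20),

$$\frac{\Delta Q}{\Delta t} = \nu_0\, e^{-2(P/P_{iso})}\left[\Delta F - \frac{P\,\Delta x_M}{N}\left(1 - \frac{P}{P_{iso}}\right)\right]. \tag{32}$$

Let us consider a muscle contracting under a tension P through some distance $\Delta l$. The velocity of contraction is given by equation (14),

$$v(P) = \frac{\nu_0\, n\, l\, \Delta x_M}{N}\, e^{-2(P/P_{iso})}\left(1 - \frac{P}{P_{iso}}\right). \tag{33}$$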
Thus the time $\Delta t$ necessary for the contraction to take place is

$$\Delta t = \frac{\Delta l}{v(P)} = \frac{N\,\Delta l}{\nu_0\, n\, l\, \Delta x_M\, e^{-2(P/P_{iso})}\left(1 - \dfrac{P}{P_{iso}}\right)}. \tag{34}$$

Then the heat H generated during the contraction is

$$H = \frac{\Delta Q}{\Delta t}\,\Delta t = \frac{N\left[\Delta F - \dfrac{P\,\Delta x_M}{N}\left(1 - \dfrac{P}{P_{iso}}\right)\right]}{n\, l\, \Delta x_M\left(1 - \dfrac{P}{P_{iso}}\right)}\,\Delta l. \tag{35}$$

FIGURE 3. Comparison of theoretical and experimental velocity load relations.
However, when considering the heat of shortening, which is the extra heat generated during the shortening, we must subtract from the total heat given off during the shortening the heat that would have been generated during an equivalent time had the muscle been held in isometric contraction. This equivalent isometric heat is

$$\frac{\Delta Q}{\Delta t}(P = P_{iso})\,\Delta t = \frac{N e^{-2}\,\Delta F\,\Delta l}{n\, l\, \Delta x_M\, e^{-2(P/P_{iso})}\left(1 - \dfrac{P}{P_{iso}}\right)}. \tag{36}$$

Thus the extra heat generated during the contraction is given by the difference between equation (35) and equation (36):

$$H - \frac{\Delta Q}{\Delta t}(P = P_{iso})\,\Delta t = \frac{N\left\{e^{-2(P/P_{iso})}\left[\Delta F - \dfrac{P\,\Delta x_M}{N}\left(1 - \dfrac{P}{P_{iso}}\right)\right] - e^{-2}\,\Delta F\right\}\Delta l}{n\, l\, \Delta x_M\, e^{-2(P/P_{iso})}\left(1 - \dfrac{P}{P_{iso}}\right)}. \tag{37}$$


To investigate the dependence of this expression on $P$ we must have some
idea of the order of magnitude of $\Delta F$. This must be about the size of

$$\frac{\Delta x_M\, P_{iso}}{N}.$$

One reason for this is that muscle is very efficient and when working at
maximum efficiency (in the region of $P = \tfrac{1}{2}P_{iso}$) dissipates an amount
of heat about three times the work it is doing. Since the work done at this
tension is

$$\frac{1}{4}\,\frac{\Delta x_M\, P_{iso}}{N},$$

we find that the total energy given out (which is the same as $\Delta F$) is

$$\frac{\Delta x_M\, P_{iso}}{N}.$$


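Substituting this value of $\Delta F$ in the above equation we obtain

$$\frac{N\, n\, l}{P_{iso}\,\Delta l}\left[H - \left(\frac{\Delta Q}{\Delta t}\right)_{P = P_{iso}} \Delta t\right] = \frac{1 - \dfrac{P}{P_{iso}}\left(1 - \dfrac{P}{P_{iso}}\right) - e^{-2(1 - P/P_{iso})}}{1 - \dfrac{P}{P_{iso}}}. \tag{38}$$
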


Values of the right side of equation (38) have been tabulated in Table I. 

Thus our model has an approximately constant heat of shortening (to
within 10% of the total heat) for shortening with loads of less than 0.8
times the isometric load. This has been experimentally shown in the work
of Hill (Proc. Roy. Soc. B, 126, p. 136-95, 1938). The larger heat at larger
load has been indicated experimentally but is not too important, for then
the muscle has a long contraction time, and it is difficult to determine
its actual excess heat of shortening experimentally.
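
As a numerical illustration of this load dependence, the following Python sketch evaluates the extra heat per unit shortening relative to its zero-load value. It assumes that, with $\Delta F$ of the order discussed above, the right side of equation (38) is proportional to $[1 - p(1 - p) - e^{-2(1 - p)}]/(1 - p)$, where $p = P/P_{iso}$. This closed form, the function names, and the sample loads are assumptions made here for illustration; they are not taken from the original tabulation.

```python
import math


def extra_heat_per_unit_shortening(p: float) -> float:
    """Dimensionless extra heat per unit shortening at relative load p = P/P_iso.

    Assumed reading of equation (38), constant prefactor dropped:
        [1 - p*(1 - p) - exp(-2*(1 - p))] / (1 - p)
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("relative load p must satisfy 0 <= p < 1")
    return (1.0 - p * (1.0 - p) - math.exp(-2.0 * (1.0 - p))) / (1.0 - p)


def relative_extra_heat(p: float) -> float:
    """Extra heat per unit shortening as a fraction of its value at zero load."""
    return extra_heat_per_unit_shortening(p) / extra_heat_per_unit_shortening(0.0)


if __name__ == "__main__":
    # Sample loads chosen for illustration only.
    for p in (0.0, 0.2, 0.4, 0.6, 0.8, 0.95):
        print(f"P/P_iso = {p:4.2f}   relative extra heat = {100 * relative_extra_heat(p):5.1f}%")
```

Under these assumptions the relative extra heat stays between roughly 88% and 100% of its zero-load value for loads up to 0.8 of the isometric load, and rises above 100% as the load approaches the isometric value.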


Further Considerations of the Simple Model 


The effect of changing temperature. Many of the kinetic properties and 
heat rates of muscle vary when the temperature of the muscle is changed. 
In this section we will consider some of these variations. 


TABLE I

$P/P_{iso}$     Extra heat of shortening
                (100% assigned to extra heat per unit shortening when $P = 0$)

0               100%
                93%
                90%
                90%
                99%

A. The maximal isometric tension.
This tension is determined from equation (6) which we rewrite here

$$\Delta x = \Delta x_M - aP. \tag{39}$$


The second term on the right is the return of the transmissive unit during
the period of the cycle when it is not coupled to the contractile unit. This
term may be written

$$aP = K\,\frac{\tau}{\eta}\,P, \tag{40}$$

in which $\tau$ is the time during which the two units are uncoupled, $\eta$ is a
viscosity term giving the viscosity of the medium through which the passive
unit must move, and $K$ is a constant not depending on temperature
or tension.

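Inserting equation (40) in equation (39) we obtain

$$\Delta x = \Delta x_M - K\,\frac{\tau}{\eta}\,P, \tag{41}$$

which, in the isometric case, reduces to

$$0 = \Delta x_M - K\,\frac{\tau}{\eta}\,P_{iso}$$

or

$$P_{iso} = \frac{\eta\,\Delta x_M}{K\,\tau}. \tag{42}$$
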
We find that $\Delta x_M$ is the length change of the active unit, and this change
we expect to be affected very little by temperature. The viscosity $\eta$ has
a small temperature coefficient.

Next we must consider how the relaxation time of the active unit varies
with temperature. Since the contraction of the active units is a relatively
rare event, we assume the units have a high activation energy in order to
go from the extended to the contracted state. In turn we may assume a
low activation energy for return to the extended state after contraction
since the return occurs very soon after contraction. Thus the time $\tau$, during
which the active and passive units are not coupled, has a small temperature
dependence. At present we cannot say whether increasing the temperature
will increase or decrease $\tau$, but the assumption of a small activation
energy for the relaxation process means that there will be only a small
temperature dependence. Thus we see from equation (42) that the isometric
tension will have a small temperature dependence. This has been
found to be true. The temperature coefficient of isometric tension has been
found to be of the order of 1.12 (Ramsey, Ann. N.Y. Acad. Sci., 47, 675,
1947).

Summary. The engineering of a biological system presents serious difficulties
imposed by the limitations of the structural materials available. When
the vascular system was treated, these limitations were accepted and not
analyzed. In the present example of the muscular system more serious 
restrictions are imposed by the materials comprising the structure of the 
system since some of the most important physiological properties of the 
system are imposed by the properties of these materials. In investigating 
the muscular system it thus seemed wise to begin by considering a model 
of the contractile mechanism itself. This has been done utilizing some 
known general properties of biological systems which we believe impose 
more important requirements on the molecular system than the miscel- 
laneous microscopic facts accumulated in experimental work on muscle. 

Considering the probable specificity of a single reaction involving the 
dimensional change of a molecule combined with the required reversibility 
of the reaction, a system has been proposed involving the use of two units 
in the active mechanism: an active unit and a passive unit. Simple kinetic 
assumptions applied to this two-component system give theoretical pre- 
dictions which show promising correspondence with experimental results. 

Conclusion. The results obtained give promise that several phenomena 
connected with the workings of the contractile system of muscle may be 
represented by a coupled two-component system. The dependence of a
biological system on the very specific properties of a single molecular
structure (i.e., the two permissible states of the active unit) has also been 
illustrated. Now, realizing the limitations placed on the muscular system 
by the contractile system itself, we may in the future proceed to consider 
along the lines of an optimal systems analysis possible alterations of this 
basic system to best suit certain functions. 

Professor H. D. Landahl has suggested that the present model is a particular
example of a general type of model. It might be of interest to investigate
more general conditions of coupling between a molecule undergoing
a specific chemical reaction inducing a dimensional change and the
more passive system it acts on.

This work was aided in part by a grant from the Dr. Wallace C. and 
Clara A. Abbott Memorial Fund of the University of Chicago. 

The principal theme of L. J. Savage’s Foundations of Statistics is the notion of “personal
probability” as the primary concept of probability theory. Since statistical theory is erected
on that of probability, it is really the foundations of the foundations with which Dr. Savage
is concerned. He approaches the subject with due caution and humility, justly pointing out that
the “foundations” are most frequently the most controversial area of any science. Why this is
so, he does not venture to speculate, but one might guess that this situation is an instance of a 
commonly observed feature in human affairs: men often find it easier to agree on what to do 
than on the reasons for doing it. 

Savage mentions two other views with regard to the nature of probability notions, the
“objectivistic,” as exemplified in the treatment of the subject by R. von Mises, and the
“necessary,” represented by R. Carnap and J. M. Keynes. The former takes probability to be inherent
in events, revealing itself as a limiting frequency of an event among several possible ones. The
latter views probability as an extension of the logical notion of implication, providing a
quantitative measure of the extent to which one assertion implies another. To these well-established
views, Savage proposes an alternative in the notion of “personal” probability (proposed also
by B. de Finetti), where the point of departure is a “degree of belief,” which a person assigns
to an event. Savage further points out that in neither of the two other approaches can any
precise quantitative meaning be given to the “degree of belief,” whereas it seems desirable to
have such meaning for a normative theory of action (including a theory of statistical decision).
It is clear, therefore, that for Savage the foundations of statistics are interwoven with behavior- 
istic notions, and he repeatedly stresses this point. 

Foundations of Statistics begins with a system of seven postulates and appropriate
definitions of the terms involved. Characteristically, these terms include not only those which are
fundamental in the prevalent “objectivistic” approach (such as “event,” union and
intersection of events, etc.) but also “person,” “state of the world,” “act,” preference among acts,
etc. The notion of decision or choice among acts is involved in the very first postulate, which
states that for a given person there exists a simple ordering of acts in terms of preference.

This postulate is by no means a weak one, if one keeps in mind that acts, as defined by
Savage, are actually functions, whose arguments are “states of the world” and whose values
are the consequences of the acts. Since typically the “state of the world” is unknown to the
person acting, the postulate of simple ordering assumes that the person reasons thus: “If I
perform the act f, and the world is in state s, the consequence will be f(s); if I do f, and the world is
in s′, there will result f(s′); if I do f′, and the world is in s, f′(s) will obtain, etc. . . . I therefore
prefer f to f′.”

To be sure, the individual does not consider all possible states of the whole universe but
only those of a “small world” relevant to the situation. Nevertheless it is important to note
that a simple ordering among acts implies the establishment of preferences not among events
or objects but among functions and so cannot rest on any obvious relation among them, except
in the trivial case where all the consequences of one act are preferred to all the consequences
of another.*


*In the definition of ordering, Savage uses the weaker concept, “not preferred to” (like
less than or equal to), but I omit this refinement to simplify the discussion.
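
The view of acts as functions from states of the world to consequences can be made concrete with a small sketch. The Python below is illustrative only: the two-state “small world,” the numerical consequences (larger meaning preferred), and the helper names are assumptions introduced here, not part of Savage’s formal system. It checks the trivial case mentioned above, in which all the consequences of one act are preferred to all the consequences of another.

```python
from typing import Callable, Dict, Sequence

State = str
# An act maps each state of the world to a consequence. Consequences are
# represented by numbers (larger = preferred); this is an illustrative assumption.
Act = Callable[[State], float]


def table_act(table: Dict[State, float]) -> Act:
    """Build an act from an explicit state -> consequence table."""
    return lambda state: table[state]


def wholly_preferred(f: Act, g: Act, states: Sequence[State]) -> bool:
    """True when every consequence of f is preferred to every consequence of g."""
    return min(f(s) for s in states) > max(g(s) for s in states)


states = ["s", "s_prime"]  # a hypothetical two-state "small world"
f = table_act({"s": 5.0, "s_prime": 4.0})
g = table_act({"s": 3.0, "s_prime": 1.0})
h = table_act({"s": 6.0, "s_prime": 0.0})

print(wholly_preferred(f, g, states))  # the trivial case: f beats g in every state
print(wholly_preferred(f, h, states), wholly_preferred(h, f, states))
```

For the pair f and h neither act is wholly preferred to the other, so an ordering between them has to come from the person, which is what the first postulate asserts is always available.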
The second postulate, based on the so-called “sure thing” principle, can be stated in a
variety of ways. All of Savage’s postulates are stated entirely formally, as mathematical postulates
should be. However their intuitive meaning is discussed at length, and, in fact, the discussion
of the postulates forms the bulk of the first five chapters of the book. Roughly the sure thing
principle implies that knowledge of an event which cannot affect the consequences of one’s
choice should not affect the choice. Entirely obvious as this principle seems from a “common
sense” point of view, it does not apply in situations where choices are made by groups of
individuals, as will appear below. Therefore the principle may be taken as part of a definition
of a rational “individual” as distinguished from a group.

The third postulate defines the null event as one in the light of which all acts appear in- 
different. 

The fourth postulate, which is essentially the definition of (personal) probability, assumes 
that acts are chosen not only with regard to the consequences for a given state of the world but 
also with regard to how the world is partitioned into the states. If we may use everyday lan- 
guage, this means that given the same prize for guessing an event, the person will, in general, 
prefer to guess one event rather than another. This preference is, of course, the degree of his 
belief that one rather than the other event will occur. 

The fifth postulate states that not all acts are indifferent. 

The sixth postulate establishes the possibility of defining quantitative personal probability
by asserting that it is possible to partition the world into states with sufficiently small
probabilities (the ordering of the probabilities having been defined in the fourth postulate, so that
“small” means “than which there is no smaller”). The formal discussion of this postulate is
quite involved and gives rise to refined mathematical notions, comparable to the subtle notions 
which must be invoked in the theory of functions of a real variable in order to establish with 
rigor the logical foundations of infinitesimal analysis. 

Finally the seventh postulate involves the establishment of a “utility” function and
demands that if all the consequences of one act are preferable to those of another (in terms of the
utility), then the first act should be preferred to the second. (Preferences among consequences 
can, of course, be defined in terms of preferences among acts, which are constant functions of 
the states of the world. It is important to note, however, that the point of departure for Savage 
is preference among acts, not consequences. The latter is a derived notion.) 

The seven postulates completely reveal the behavioristic orientation of the approach to 
probability taken in Foundations of Statistics. By behavioristic I mean operational in the sense 
that acts and their results are taken to be the content of knowledge. Not because a certain 
event is more probable than another does a person prefer to bet on it, but because he bets on it, 
we infer that (to him) the event is more probable. In this sense, the re-establishment of an un- 
ambiguous (except for a linear transformation) utility function, as achieved by J. von Neu- 
mann and accepted by Savage, is based on the behavioristic approach, where preference in the 
face of uncertainty reveals the nature of an individual’s utility function. 

In contemplating Savage’s treatment, the question naturally arises whether the
“personalistic view” is any less “objectivistic” than the widely current objectivistic view, which claims
that it starts from a probability defined by the outcomes of repeated experiments. It seems to
me that the determination of probabilities in the personalistic sense is equally empirical, except
that the experiments are different. Instead of tossing coins, the personalist asks people to bet
on outcomes. Just as a certain long-run regularity in the behavior of the coin is necessary for
the objectivistic establishment of the probability of “heads,” so a certain regularity
(consistency) in the behavior of a person is required to establish the degree of his belief in an event.

From this point of view, it seems to me that the “personalistic theory” does not differ in
method from the “objectivistic” one but only in the choice of the universe in which events are
to be considered as fundamental in the establishment of certain assertions. The universe of the
personalistic theory consists of persons and their decisions under a variety of conditions rather
than of inanimate objects and their behavior under a variety of conditions. However,
controlled experiment seems as fundamental for the one as for the other.

There is, however, another important facet to the personalistic theory which Savage holds 
to be central: the normative aspect of the theory. He is careful to point out that a normative 
theory does not describe how people do in fact act but how they should act if they wish to 
adhere to certain norms of consistency. It seems to me that this normative aspect is inherent 
in all theory. For a theory must contain deductions. And deductions are typically normative 
statements: if you believe this, you should also believe that. 

To take an example from the early years of probability theory, what bothered Chevalier de
Méré was a notion of how dice should behave. If he was able to win in the long run by betting
even money that an ace will occur on or before the fourth throw of a single die, he felt he should
be able to win if he bet that a double ace will occur on or before the 24th throw of two dice.
Pascal showed that the flaw was in Chevalier de Méré’s logic and not in his luck, that granted
the assumptions which imply a win in the first case, the same assumptions imply a loss in the
second. Therefore, Pascal argued in effect, de Méré should lose in the long run on the double
ace bet. Now how does this problem appear in personalistic language? If you prefer to bet in a 
certain way on the throw of one die, then you should prefer (to be consistent) to bet in a certain 
way on the throw of two dice. 
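
The arithmetic behind this example is short enough to check directly. The following Python sketch assumes fair dice and independent throws, which are the assumptions the example rests on; the function and variable names are introduced here for illustration.

```python
from fractions import Fraction


def prob_at_least_once(p_single: Fraction, throws: int) -> Fraction:
    """Probability that an event with per-throw probability p_single
    occurs at least once in `throws` independent throws."""
    return 1 - (1 - p_single) ** throws


# An ace on or before the fourth throw of a single die.
ace_in_4 = prob_at_least_once(Fraction(1, 6), 4)

# A double ace on or before the 24th throw of two dice.
double_ace_in_24 = prob_at_least_once(Fraction(1, 36), 24)

print(ace_in_4, float(ace_in_4))        # 671/1296, about 0.5177
print(float(double_ace_in_24))          # about 0.4914
```

The first probability exceeds one half and the second falls short of it, so the same assumptions that favor the even-money bet on the single die make the even-money bet on the double ace a losing one in the long run.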

Both conclusions are normative conclusions. The difference is in the events to which the 
conclusions apply: the behavior of the dice in the one case, of the person in the other. 

It seems, therefore, that the real difference between the personalistic theory and the objec- 
tivistic does not lie in the circumstance that the former avoids the difficulty of an empirical 
definition of probability (personal probability, when it comes down to it, is also empirically 
defined); nor in the circumstance that the personalistic theory is primarily normative (all 
theories are normative); but rather, and significantly, in that the personalistic theory does give 
operational meaning to the psychological facets of probability and to probabilities of events 
which cannot be repeated, which the objectivist theory, as stated by its chief proponents, 
does not do. 

The claim to greater generality of the personalistic theory thus seems to be justified. It
also appears, as Savage implies, that the “necessary” theory is a special case of the
personalistic, where everyone is assumed to have equal a priori personal probabilities about certain
events. However, since Savage in the main confines his critique to the comparison of the
personalistic and the objectivistic views, a discussion of the relation between the “necessary” (or
logical) and the personalistic approaches to probability is beyond the scope of this review.

The behavioristic flavor of the personalistic approach to probability places statistics into 
intimate relations with those branches of modern behavioral science where the center of interest 
is the decision process. It also provides a bridge between statistics and the mathematical biol- 
ogy of behavior. One notes, for example, that the conditioning, learning, and discrimination 
models of Rashevsky and Landahl contain “built in” personal probabilities in the form of
thresholds and excitation biases as determinants of behavior. Both Rashevsky in his theory 
of mass behavior and Landahl in his classification of gambling patterns begin by assuming dis- 
tributions of preferences in the population, which Savage formalizes in his first postulate. Such 
preferences, implicit in costs of observations and costs of errors, appear also in the work of 
mathematical statisticians whom Savage considers objectivists, notably in the work of the 
late A. Wald, to which Savage devotes special attention. Likewise the theory of the two-person
game, as developed by von Neumann, Morgenstern, Blackwell, Girshick, Shapley, et al. is
discussed in considerable detail. Curiously, in his introduction Savage considers his discussion
of partition problems, sufficient statistics, likelihood ratios, etc., as relevant to the
“foundations” of statistics, while labeling the chapters dealing primarily with minimax problems and
other aspects of game theory as concerned with “statistics proper.” It seems to me that the first
five chapters, which deal entirely with the axiomatic basis of the personalistic view, actually
establish the foundations, while the entire discussion of observations, inferences, and what
amounts to behavior strategies constitutes the edifice of statistics in the light of the behavioral 
point of view. The emphasis upon game theory is certainly entirely proper, since according to 
the behavioral view the heart of probability theory is in the guidance it provides for consistent 
action in the face of uncertainty. In this connection, Savage’s discussion of the minimax theory, 
where many current misconceptions are laid bare, is illuminating. 

Especially interesting is Savage’s personalistic interpretation of the minimax theory as ap- 
plied to group decision. As is well known, the minimax rule is a principle of decision making, in 
which the greatest possible loss is calculated for each of the possible courses of action (depend- 
ing on the states of the world) and this loss is then minimized with respect to the course of ac- 
tion chosen. It is shown in the theory of games that the minimax is the best strategy in a two 
person zero sum game, provided “mixed acts” (where decisions are determined by a chance
device) are allowed. Savage shows that the minimax policy can be formally applied to group
decisions where, however, the variables have a different interpretation. One calculates the
maximum loss over the individuals in the group for each act and then minimizes this maximum over
all the acts. Roughly a minimax group decision is one in which no member is forced to face a
very large loss. Although supporting the minimax rule, Savage considers it “flagrantly
undemocratic,” presumably because it often violates the principle of majority rule. It seems to me,
however, that the minimax rule embodies a much more important principle of democratic de- 
cision, namely, a respect for individual rights. Majority rule, when applied to the suppression 
of minority rights, perverts rather than serves what many consider to be the ideal of democracy. 
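
The minimax rule as described above can be sketched in a few lines of Python. The sketch covers pure acts only (no “mixed acts”); the loss tables, act names, and the simple majority count are hypothetical illustrations introduced here, not examples taken from Savage’s book.

```python
from typing import Dict, Tuple

# losses[act][column] -> loss. For an individual decision the columns are
# states of the world; for a group decision they are members of the group.
LossTable = Dict[str, Dict[str, float]]


def minimax(losses: LossTable) -> Tuple[str, float]:
    """Return the act whose greatest possible loss is smallest, with that loss."""
    worst_case = {act: max(row.values()) for act, row in losses.items()}
    best_act = min(worst_case, key=lambda act: worst_case[act])
    return best_act, worst_case[best_act]


def majority_choice(losses: LossTable, act_a: str, act_b: str) -> str:
    """Return whichever of two acts more members strictly prefer (lower loss)."""
    members = losses[act_a].keys()
    votes_a = sum(1 for m in members if losses[act_a][m] < losses[act_b][m])
    votes_b = sum(1 for m in members if losses[act_b][m] < losses[act_a][m])
    return act_a if votes_a >= votes_b else act_b


# Hypothetical individual decision: columns are states of the world.
individual = {
    "carry umbrella": {"rain": 1.0, "shine": 2.0},
    "leave umbrella": {"rain": 10.0, "shine": 0.0},
}

# Hypothetical group decision: columns are members of the group.
group = {
    "act A": {"member 1": 0.0, "member 2": 0.0, "member 3": 9.0},
    "act B": {"member 1": 3.0, "member 2": 3.0, "member 3": 2.0},
}

print(minimax(individual))                       # ('carry umbrella', 2.0)
print(minimax(group))                            # ('act B', 3.0)
print(majority_choice(group, "act A", "act B"))  # act A
```

In the hypothetical group table the majority prefers act A, which imposes a large loss on one member, while the minimax rule selects act B, under which no member loses more than 3. This is the sense in which the rule can conflict with majority rule while protecting the individual.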

Savage offers the minimax rule as a “rule of thumb” for group decisions (no more, no less
than majority rule or any other specifiable rule). However, the implications are extremely far
reaching and have instigated lively discussion concerning the possible meanings of the “best”
decision, where the interests of several individuals are involved (already anticipated in
Rashevsky’s Mathematical Biology of Social Behavior). In particular, where group decision is involved,
the order of preference between two acts may be reversed as a consequence of the availability
of a third act. Such reversals seem absurd in the case of individual decision. To take Savage’s
example, this would amount to saying to the butcher, “Seeing that you have geese, I’ll take duck
instead of chicken.” Certainly if an individual prefers chicken to duck, the mere availability
of geese should make no difference in the order of that preference. Now, to continue Savage’s
example, it is not at all strange for a banquet committee to compromise on duck when goose
is available when otherwise they would have ordered chicken (to appease the dark-meat
fanciers, who, in the absence of goose, would have considerably less to say). The example
points up vividly the additional dimension in group decision and brings to mind the
thought-provoking work of K. Arrow, J. Marschak, and others in that field. It seems to me, as it does to
Savage, that the consideration of these problems as intimately associated with those of
statistics and probability has been given an impetus by the personalistic point of view.

On the other hand, the behavioristic revision of estimation seems amply anticipated in, say, 
the work of Wald, resting, as it does, on the evaluation of the costs of consequences of esti- 
mates. These being given a priori and peculiar to a situation seem to me already to reflect the 
personalistic point of view, if it is stressed that the probabilities of the consequences are taken 
as “personal probabilities.”

Savage has amply shown, as he set out to do, that the personalistic view of probability leads 
to the same applied mathematical statistics as does the objectivist view and is therefore equally 
justifiable on pragmatic grounds. But it does more in that it extends the concept of probability 
to areas beyond the scope of the objectivist view. It seems strange to me, however, that in view 
of the almost universal agreement on the axiomatic system which underlies the purely mathe- 
matical content of probability, as formulated, for example, by A. Kolmogoroff, Savage does not 
assert that the postulational approach could be taken as the most general of all. It is true, as 
he points out, that almost all controversy concerning “foundations” “centers on questions of
interpreting the generally accepted axiomatic concept.” But is not this controversy in the last
analysis an argument about which class of phenomena to consider primary and which
secondary? In other words, is this not the tail end of the tedious, doctrinaire materialist-idealist
quarrel, in which Savage takes the side which by materialists would be dubbed as “idealist” (with
no more justification than the epithet “idealist” is attached by Soviet philosophers to physicists
who begin with operational definitions of physical variables)?

Kolmogoroff in his now classical monograph Foundations of the Theory of Probability is quite 
explicit in drawing the parallel between the axiomatized system which he proposes and the 
axiomatization of geometry by Hilbert (since Hilbert all branches of mathematics have be- 
come strictly axiomatized). It is true that Kolmogoroff immediately interprets his system in 
terms of the traditional objectivist view, but if the axiomatic foundations are as sound as they 
appear, are not other interpretations equally justifiable philosophically speaking? Is not the
only question whether this or that portion of the universe (whether outcomes of experiments
or human behavior) is adequately described by a proposed model? This question seems to me
to be an empirical question. I am not minimizing the normative aspects of theory stressed by
Savage. I am merely pointing out that the normative aspects present no philosophical problem.
Being purely deductive, they may be derived from an arbitrary set of assumptions. Thus if one
means by “foundations” the axiomatic base of a system, then any set which leads to the same
theorems is as good as any other. If, however, one means by foundations the fundamental rela- 
tions of the theoretical system to the real world, then one must specify which part of the world 
one is talking about. All this is obvious in physical statistics. If a class of particles does not 
obey the Maxwell-Boltzmann statistics, one tries the Bose-Einstein or the Fermi-Dirac
without quarreling about which model is “true” in any transcendental sense. Does Savage’s thesis
need any more justification than an appeal to the situation in physics? These remarks are 
meant, of course, not as a criticism of Savage’s arguments but as a surprise that the argument 
should need to be so energetically defended at this late date. 

On the other hand, additional stressing of the postulational nature of all theoretical inquiry 
would have served Savage well in anticipating possible objections. For example, when he com- 
pares decisions of people having the same utility functions but different probability estimates, 
it may be objected that the same behavior may be deduced from an alternative hypothesis of 
identical probability estimates but different utility functions. It would have added poignancy
to Savage’s discussion had he raised the question of whether utility functions and probability
estimates can be disentangled and, if it were found that they cannot be, appealed to the
principle of indifference among models which lead to identical conclusions.

I believe most readers will find Savage’s literary style elegant and entertaining. The
parallelism of informal and formal exposition provides welcome breathing spells and helps to drive
important points home. The homely, simple-minded examples suggest the common-sense 
origin of even the most abstract ideas of modern statistics and point up the power of abstract 
treatment. Thus the dependence of the cost of an observation on the value observed is illus- 
trated by testing the sharpness of a thorn with one’s finger; negative cost of an observation 
by a cook tasting the broth; the non-infinite negative utility of immediate death by a man 
crossing the street to greet a friend; a sequential program of observation by the fact that we 
look for an object until we find it and no longer; action in the face of uncertainty by a decision 
of whether to break a questionable egg into a bowl with good eggs. This book, although by no 
means easy, is remarkably remunerative to one willing to exert a determined effort to under- 
stand it and is effective in clarifying many extremely involved matters of statistical theory 
at least for the mathematically mature reader. 

ERRATUM


To the paper by Anatol Rapoport entitled “Application of Information
Networks to a Theory of Vision” (Vol. 17, pp. 15-33).

In the above paper, the first word in the ninth sentence of the first
paragraph should read “former,” not “latter.”